Senior Site Reliability Engineer

Rp20.000.000 - 35.000.000/month
Full-time · Hybrid
Minimum Bachelor's degree (S1)
5 - 10 years of experience

Skills

Prometheus

Observability

Database Systems

Grafana

GitOps

CI/CD

Networking

Kubernetes

Infrastructure as Code

Docker

Cloud Platform System

Kafka Operations

Job Benefits

Gym membership discount

Device service support

Annual salary increment

Private Insurance

F&B subscription discount

Wellness & birthday leave

Hybrid work arrangement

Laptop ownership program

This job listing is managed by

Rian Rosidi

Job description: Senior Site Reliability Engineer at Reku

We are looking for an experienced Senior Site Reliability Engineer to join our Engineering team. You will work closely with the team to build software systems and automation for our operations. You will also be responsible for monitoring our systems and building alerts for the operational issues they can experience.

What you will do:

Reliability & Incident Management

  • Define and enforce SLOs, SLIs, and error budgets for trading, payment, KYC, and notification services
  • Own the on-call rotation structure, runbooks, and postmortem culture
  • Lead incident response for P0/P1 issues; coordinate across backend, mobile, and compliance teams
  • Reduce MTTR through better alerting in Coralogix, auto-remediation, and progressive delivery
  • Drive blameless postmortems and ensure action items are tracked to closure in Linear

Infrastructure & Platform

  • Own GKE clusters (dev, staging, production) and ArgoCD GitOps pipelines
  • Harden the Caddy API gateway: routing, rate limiting, TLS termination, canary rollouts via weighted_round_robin load balancing
  • Drive migration off legacy infrastructure: Docker Swarm, Consul-based service discovery cleanup
  • Lead capacity planning, GCP cost optimization, and multi-AZ / disaster recovery strategy
  • Own IaC for all infrastructure (Terraform / Helm / Kustomize)

Observability & Developer Experience

  • Expand Coralogix coverage across logs, traces, alerts, and dashboards; enforce structured logging standards
  • Improve deploy workflows
  • Maintain Telepresence setups, test clusters, and developer self-service tooling
  • Close the feedback loop between alerts, Linear tickets, and engineering fixes

Security & Compliance

  • Own the infrastructure for secrets management, key rotation, and production access controls (the platforms and workflows, with policy set by Security)
  • Harden the CI/CD supply chain and container runtime posture — image provenance, signing, base image hygiene, build isolation

Data & Streaming

  • Own Kafka topology hygiene; plan and execute staging/production isolation
  • Ensure Debezium CDC pipeline reliability and lag monitoring
  • Partner with backend on SQL migration safety (Bytebase, gh-ost) for online schema changes
  • Define and enforce database operational standards (backups, replication, failover drills)

Mentorship & Culture

  • Level up backend, mobile, and frontend engineers on operational thinking
  • Establish and run quarterly DR / game-day exercises
  • Contribute to engineering documentation, design reviews, and architectural decisions

What we are looking for:

Required Qualifications

  • 7+ years of SRE / DevOps / Platform Engineering experience, with 3+ years leading reliability for production systems at scale
  • Proven track record owning production systems in a financial services, fintech, or high-availability environment
  • Experience leading P0/P1 incidents as incident commander and driving systemic improvements from postmortems

Required Technical Skills

  • Kubernetes (expert) — GKE preferred; multi-cluster, workload identity, network policies, HPA/VPA, node pool design
  • GitOps — deep experience with ArgoCD (or Flux) and Helm / Kustomize
  • Kafka operations — brokers, consumer groups, partition rebalancing, lag monitoring; RedPanda or Confluent a plus
  • Cloud platforms — GCP preferred (Artifact Registry, Cloud SQL, VPC, IAM, Cloud Logging); AWS / Azure transferable
  • Observability — hands-on with Coralogix, Datadog, Grafana, or equivalent; SLO engineering and alert quality, not just dashboards
  • Infrastructure as Code — Terraform or Pulumi; strong YAML / Helm templating
  • Programming & scripting — Go, Python, or Bash to production quality
  • CI/CD — GitHub Actions, container build pipelines, artifact promotion flows
  • Networking — TCP/IP, DNS, TLS, load balancers, reverse proxies (Caddy, Envoy, nginx)
  • Databases — PostgreSQL operations (replication, failover, connection pooling), Redis, online schema migrations (gh-ost, Bytebase, pt-online-schema-change)

Nice-to-Have Skills

  • Track record using AI coding assistants to accelerate platform work — automating ops tasks, generating IaC, triaging incidents
  • Crypto exchange / trading / low-latency systems experience (order flow, market data, wallet/custodian risk)
  • Regulated environment experience (OJK, MAS, SEC, or similar financial regulators)
  • Service mesh (Istio, Linkerd); eBPF tooling (Cilium, Pixie)
  • Chaos engineering (Litmus, Chaos Mesh)
  • FinOps / cloud cost optimization track record
  • Legacy infrastructure retirement (Docker Swarm → K8s, monolith → microservices)
  • Caddy, Consul, or similar service discovery

About the Company

Reku
Financial Services
51 - 200 employees

Reku, formerly known as Rekeningku, is a leading Indonesian crypto exchange that gives users a powerful and trusted platform to invest in, buy, and sell a wide range of crypto assets.

Founded in 2018, Reku was and remains a pioneer in crypto, providing a curated list of quality coins and tokens, 24/7 live chat customer support, and the fastest and cheapest way to invest in crypto.

Office address

Equity Tower 11th Floor, Suite E - SCBD Jakarta, RT.5/RW.3, Senayan, Kec. Kby. Baru, Kota Jakarta Selatan, Daerah Khusus Ibukota Jakarta 12190
