Job description: Sr. Machine Learning Engineer, Glints Taploker
We're looking for a Senior ML Engineer to build real-time ML systems for a mobile advertising platform. You'll own the full lifecycle: feature engineering, model training, deployment, and production monitoring.
What You'll Build:
ML Serving Infrastructure
- Real-time, low-latency model serving
- Deep learning (PyTorch) and tree-based models (LightGBM/XGBoost) behind FastAPI on AWS ECS
MLOps Pipelines
- Model promotion workflows: local → staging → production
- Automated validation gates before promotion (offline eval metrics, data quality checks)
- Retraining pipelines via Databricks Workflows (scheduled + drift-triggered)
- CI/CD for both code and model artifacts
- Blue-green or canary deployments for safe rollouts
Monitoring & Observability
- Model quality tracking: accuracy, precision, recall, AUC over time
- Data drift and feature drift detection with automated alerts
- Feature freshness monitoring (detecting feature store lag before it impacts predictions)
- Inference latency: p50, p95, p99 via Grafana dashboards
- Sentry for error tracking, CloudWatch + Prometheus for infra metrics
- A/B testing infrastructure for model experiments in production
You will:
- Design low-latency model serving architectures
- Maintain reusable feature pipelines
- Maintain ML models with reliable CI/CD pipelines
- Improve model performance, scalability, and reliability
- Collaborate with data scientists, data engineers, and product teams
Requirements:
- 5+ years of Python, 3+ years of ML engineering
- Strong Databricks experience (Feature Store, MLflow, Unity Catalog)
- AWS production experience (ECS, S3, CloudWatch)
- Built real-time ML systems with latency requirements
Nice to Have:
- AdTech, fraud detection, or recommendation systems experience
- Kafka, Redis, streaming pipelines
- Grafana, Prometheus, Sentry
Tech Stack: Databricks, PySpark, PyTorch, LightGBM, FastAPI, AWS ECS, Redis, Kafka, MLflow, Grafana, Prometheus, Sentry



