      EPAM Systems


      MLOP Engineer Interview

      Jun 3, 2026
      Anonymous interview candidate
      Hyderabad
      No offer
      Positive experience
      Difficult interview

      Application

      I interviewed at EPAM Systems (Hyderabad)

      Interview

      2. What is MLOps?
      Answer: MLOps is the practice of applying DevOps principles to machine learning systems.
      It covers: data management, model development, model versioning, deployment, monitoring, retraining.
      Lifecycle: Data Collection → Data Validation → Feature Engineering → Model Training → Model Validation → Deployment → Monitoring → Retraining

      3. Difference between DevOps and MLOps?
      DevOps                        | MLOps
      Focuses on application code   | Focuses on data + model + code
      CI/CD                         | CI/CD/CT
      Version code                  | Version code + data + models
      Functional testing            | Model testing
      Performance monitoring        | Model drift monitoring

      4. What is CI/CD/CT in MLOps?
      CI (Continuous Integration): Code Commit → Unit Tests → Build
      CD (Continuous Delivery): Build → Deploy
      CT (Continuous Training): New Data → Retrain Model → Validate → Deploy

      5. How do you version ML models?
      Tools: MLflow, DVC, S3, Git
      Example:
          import mlflow
          mlflow.sklearn.log_model(model, "customer_churn")
      Versions: v1, v2, v3

      6. Explain MLflow
      Components: Tracking, Projects, Models, Registry
      Example:
          with mlflow.start_run():
              mlflow.log_param("lr", 0.01)
              mlflow.log_metric("accuracy", 0.95)
      Interview follow-up: Why MLflow?
      Answer: Track experiments, compare runs, register models, and manage deployments.

      7. What is Data Drift?
      Answer: The input data distribution changes over time.
      Example: Training: Age 20-40. Production: Age 50-80. Model performance drops.

      8. What is Concept Drift?
      Answer: The relationship between features and target changes.
      Example: Before COVID: online spending low. After COVID: online spending high. Same inputs but different outcomes.

      9. How do you detect drift?
      Methods: PSI (Population Stability Index), KL divergence, Wasserstein distance, KS test
      Example:
          from scipy.stats import ks_2samp
          ks_2samp(train_data, prod_data)

      10. How do you monitor models?
      Metrics:
        Business metrics: revenue, conversion, CTR
        Model metrics: accuracy, precision, recall, F1
        System metrics: CPU, memory, latency, throughput
      Tools: Prometheus, Grafana, ELK

      11. Explain Model Retraining Pipeline
      New Data → Validation → Feature Engineering → Training → Evaluation → Deployment
      Trigger: weekly, monthly, or on drift detection

      12. What is Feature Store?
      Answer: A central repository for ML features.
      Benefits: feature reuse, consistency, online serving, offline training
      Tools: Feast, Tecton

      13. Explain Docker in MLOps
      Dockerfile:
          FROM python:3.11
          COPY . /app
          WORKDIR /app
          RUN pip install -r requirements.txt
          CMD ["python","app.py"]
      Benefits: portability, reproducibility

      14. Difference between Docker and Kubernetes?
      Docker             | Kubernetes
      Containerization   | Orchestration
      Single container   | Multiple containers
      Packaging          | Scaling

      15. How do you deploy ML models on Kubernetes?
      Steps: Build Docker Image → Push to Registry → Create Deployment → Create Service → Expose API
      Deployment:
          apiVersion: apps/v1
          kind: Deployment
          metadata:
            name: model
          spec:
            replicas: 3

      16. What is Canary Deployment?
      Answer: Deploy the new model to a small percentage of users.
      90% → Old Model
      10% → New Model
      If successful: 100% New Model

      17. Blue-Green Deployment?
      Answer: Blue = Production, Green = New Version. Switch traffic instantly.
      Benefits: zero downtime, easy rollback

      18. How would you deploy a model with zero downtime?
      Answer: Kubernetes rolling update, blue-green deployment, canary deployment

      19. How do you handle large datasets?
      Techniques: Spark, partitioning, parallel processing
      Example:
          df.repartition(100)

      20. What if training data is 1 TB?
      Answer: Never load it into memory.
      Use: Spark, batch processing, distributed training

      21. What if model training takes 12 hours?
      Answer: Options: distributed training, GPU, hyperparameter optimization, incremental learning

      22. Explain Kubernetes HPA
      Horizontal Pod Autoscaler
      CPU > 70%
      Scale: 3 Pods → 10 Pods
      Example:
          kubectl autoscale deployment model --cpu-percent=70 --min=3 --max=10

      23. What happens if a pod crashes?
      Answer: Kubernetes automatically recreates it.
      Controller: the ReplicaSet maintains the desired state.

      24. How do you secure ML APIs?
      Methods:
        Authentication: JWT, OAuth
        Encryption: HTTPS, TLS
        Secrets: Kubernetes Secrets, AWS Secrets Manager

      25. Explain FastAPI deployment
          from fastapi import FastAPI

          app = FastAPI()

          @app.get("/")
          def predict():
              return {"prediction": 1}
      Run: uvicorn app:app

      26. What is Model Explainability?
      Techniques: SHAP, LIME, feature importance
      Example:
          import shap
      Shows why a prediction happened.

      27. Scenario: Accuracy dropped from 95% to 70%
      Approach
      Check: data drift, concept drift, data quality, pipeline failures, feature changes
      Then: retrain, validate, redeploy

      28. Scenario: Prediction API latency increased
      Investigate: CPU, memory, network, database, model size
      Optimization: caching, autoscaling, quantization, GPU inference

      29. Scenario: Production model gives different results than training
      Root causes: feature mismatch, data preprocessing mismatch, version mismatch, missing transformations
      Solution: Use the same pipeline object for training and serving.

      30. Design an End-to-End MLOps Architecture
      Data Sources → Kafka → Spark → Feature Store → Training Pipeline → MLflow → Model Registry → Docker → Kubernetes → FastAPI → Prometheus/Grafana → Retraining Pipeline

      Advanced EPAM Follow-up Questions
      Why use Kubernetes instead of ECS? Multi-cloud support, better ecosystem, advanced autoscaling, service mesh support.
      Why MLflow over DVC? Experiment tracking, model registry, deployment integration.
      How
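
Question 9 lists PSI among the drift detection methods but only shows the KS test. A minimal sketch of PSI in plain Python follows, reusing the age ranges from question 7. The function name, the bin count, and the `eps` floor for empty bins are assumptions made for illustration, and the 0.1 / 0.25 cut-offs mentioned in the comments are a common rule of thumb, not something the interview notes state.

```python
import math
import random


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between two numeric samples, binned on the range of `expected`."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the training range are clamped into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


rng = random.Random(0)
train_ages = [rng.uniform(20, 40) for _ in range(1000)]  # training: age 20-40
same_ages = [rng.uniform(20, 40) for _ in range(1000)]   # fresh sample, no drift
prod_ages = [rng.uniform(50, 80) for _ in range(1000)]   # production: age 50-80

psi_stable = population_stability_index(train_ages, same_ages)
psi_shifted = population_stability_index(train_ages, prod_ages)

# Rule of thumb (assumption): below 0.1 is stable, above 0.25 is a large shift.
print(f"no drift: {psi_stable:.3f}, drifted: {psi_shifted:.3f}")
```

Binning on the training range means the reference distribution defines the buckets, so a production sample that falls entirely outside it piles into the last bin and yields a very large PSI.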
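
The 90% / 10% split in question 16 can be sketched as a deterministic router. This is only an illustration: in practice the split is usually done by a load balancer or service mesh, and the `choose_model` function and its hashing scheme are hypothetical, not taken from the interview notes.

```python
import hashlib


def choose_model(user_id: str, canary_percent: int = 10) -> str:
    """Route a user to the 'new' (canary) or 'old' model.

    Hashing the user id keeps each user on the same model across requests,
    which makes the two groups easier to compare.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return "new" if bucket < canary_percent else "old"


users = [f"user-{i}" for i in range(10_000)]
canary_share = sum(choose_model(u) == "new" for u in users) / len(users)
print(f"share routed to the new model: {canary_share:.1%}")
```

Raising `canary_percent` step by step to 100 corresponds to the "If successful: 100% New Model" step, and setting it to 0 is the rollback.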
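
For question 22, the scaling decision can be sketched with the formula given in the Kubernetes documentation, desired = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The function below is a simplified model with assumed parameter names; the real controller also handles pod readiness, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_cpu, target_cpu,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA decision: scale proportionally to metric / target."""
    ratio = current_cpu / target_cpu
    # Within the tolerance band around 1.0 the replica count is left alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return min(max(desired, min_replicas), max_replicas)


# 3 pods averaging 230% of their CPU request against a 70% target.
print(desired_replicas(3, 230, 70, min_replicas=3, max_replicas=10))
```

With a 70% target, three pods running far above their CPU request scale out to the 10-pod ceiling, which matches the "3 Pods → 10 Pods" example in the answer.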
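
The fix proposed in question 29, using the same pipeline object in training and serving, can be sketched as follows. The `Scaler` class is a hypothetical stand-in for a real preprocessing pipeline, and JSON is used here only to keep the example free of dependencies; the point is that serving loads the fitted state saved by training instead of recomputing or reimplementing it.

```python
import json
import statistics


class Scaler:
    """Tiny standardizing step whose fitted state can be saved and reloaded."""

    def __init__(self, means=None, stdevs=None):
        self.means = means
        self.stdevs = stdevs

    def fit(self, rows):
        columns = list(zip(*rows))
        self.means = [statistics.fmean(c) for c in columns]
        # A constant column would give stdev 0, so fall back to 1.0.
        self.stdevs = [statistics.pstdev(c) or 1.0 for c in columns]
        return self

    def transform(self, rows):
        return [[(x - m) / s for x, m, s in zip(row, self.means, self.stdevs)]
                for row in rows]

    def to_json(self):
        return json.dumps({"means": self.means, "stdevs": self.stdevs})

    @classmethod
    def from_json(cls, payload):
        state = json.loads(payload)
        return cls(state["means"], state["stdevs"])


# Training side: fit once and store the artifact next to the model.
train_rows = [[20.0, 1000.0], [30.0, 2000.0], [40.0, 3000.0]]
scaler = Scaler().fit(train_rows)
artifact = scaler.to_json()

# Serving side: load the stored artifact rather than refitting on live data.
serving_scaler = Scaler.from_json(artifact)
request = [[30.0, 2000.0]]
serving_features = serving_scaler.transform(request)
print(serving_features)
```

Because the serving side never calls `fit`, the features it produces for a given input are identical to what the training code would produce, which removes the preprocessing mismatch listed among the root causes.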
