Level: Intermediate

MLOps Engineer Path

Learn to build, deploy, and operate machine learning systems at scale. Master the complete ML lifecycle from experiment tracking to production monitoring. Based on practices from top ML teams at Google, Meta, and Netflix.

14 weeks
10 milestones

Skills You Will Gain

  • Experiment Tracking
  • Model Registry
  • ML Pipelines
  • Model Serving
  • Feature Stores
  • Data Versioning
  • ML Monitoring
  • Kubernetes
  • Docker
  • CI/CD for ML

Prerequisites

  • Python programming proficiency
  • Basic ML/DL knowledge
  • Linux/Unix command line
  • Basic DevOps concepts (CI/CD, containers)
  • SQL fundamentals

Learning Milestones

1

MLOps Foundations

Understand the MLOps landscape, its importance, and core principles.

~10h

Learning Objectives

  • Understand ML lifecycle and its challenges in production
  • Learn the MLOps maturity levels (0-4, the range used in Microsoft's MLOps maturity model)
  • Compare MLOps vs DevOps vs DataOps
  • Identify key MLOps tools and their categories
  • Understand technical debt in ML systems
  • Learn about ML system design patterns
2

Containerization & Orchestration

Master Docker and Kubernetes for ML workloads.

~18h

Learning Objectives

  • Build optimized Docker images for ML applications
  • Use multi-stage builds for smaller images
  • Manage GPU-enabled containers
  • Deploy ML workloads on Kubernetes
  • Use Helm charts for ML deployments
  • Implement auto-scaling for inference services
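
To make the auto-scaling objective concrete, here is a minimal sketch of the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance of 1.0. The function name, the min/max bounds, and the example numbers are assumptions for illustration; the real controller also handles pod readiness, missing metrics, and stabilization windows.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Sketch of the HPA scaling rule (hypothetical helper, not a Kubernetes API)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 replicas averaging 200 requests/s each against a target of 100 requests/s would scale to 8, while a reading of 105 requests/s falls inside the tolerance band and leaves the count at 4.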
3

Experiment Tracking & Reproducibility

Set up experiment tracking and ensure ML reproducibility.

~15h

Learning Objectives

  • Use MLflow for experiment tracking
  • Implement Weights & Biases for team collaboration
  • Version datasets with DVC (Data Version Control)
  • Create reproducible ML experiments
  • Track hyperparameters, metrics, and artifacts
  • Compare and visualize experiment results
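
The objectives above rest on one idea: every run records its hyperparameters and metrics, and the same configuration plus the same seed gives the same result. The sketch below illustrates that idea with the standard library only. It is not the MLflow or Weights & Biases API; `run_id`, `train`, `track`, and `best_run` are hypothetical names, and `train` is a stand-in that fakes a metric from a seeded random generator.

```python
import hashlib
import json
import random

def run_id(params):
    """Derive a stable ID from the hyperparameters (identical configs share an ID)."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]

def train(params):
    """Stand-in for a training job; the seeded RNG makes the metric repeatable."""
    rng = random.Random(params["seed"])
    noise = rng.uniform(-0.01, 0.01)
    return {"val_accuracy": round(0.9 - abs(params["lr"] - 0.01) + noise, 4)}

def track(params, log):
    """Run the job and append params and metrics to the experiment log."""
    record = {"run_id": run_id(params), "params": params, "metrics": train(params)}
    log.append(record)
    return record

def best_run(log, metric):
    """Compare runs by a single metric and return the best one."""
    return max(log, key=lambda record: record["metrics"][metric])
```

Calling `track` twice with the same parameters yields the same `run_id` and the same metrics, which is the property a real tracking setup aims to preserve across machines and over time.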
4

Feature Engineering & Feature Stores

Build feature pipelines and implement feature stores for ML.

~18h

Learning Objectives

  • Design feature engineering pipelines
  • Implement feature stores (Feast, Tecton)
  • Handle online vs offline feature serving
  • Build real-time feature computation
  • Manage feature versioning and lineage
  • Implement feature monitoring
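
The online versus offline distinction can be sketched in a few lines: online serving returns the latest value for an entity, while offline (training) retrieval returns the value as of a past timestamp so that no future information leaks into training data. This toy in-memory class is an assumption made for illustration and does not mirror the Feast or Tecton APIs; `TinyFeatureStore` and its methods are hypothetical names.

```python
from bisect import bisect_right

class TinyFeatureStore:
    """Toy in-memory store keeping the full history of one feature per entity."""

    def __init__(self):
        self._history = {}  # entity_id -> list of (timestamp, value), sorted by timestamp

    def write(self, entity_id, timestamp, value):
        rows = self._history.setdefault(entity_id, [])
        rows.append((timestamp, value))
        rows.sort(key=lambda row: row[0])

    def get_online(self, entity_id):
        """Online serving: the most recent value, or None if the entity is unknown."""
        rows = self._history.get(entity_id)
        return rows[-1][1] if rows else None

    def get_offline(self, entity_id, as_of):
        """Point-in-time lookup: the latest value with timestamp <= as_of."""
        rows = self._history.get(entity_id, [])
        timestamps = [ts for ts, _ in rows]
        idx = bisect_right(timestamps, as_of)
        return rows[idx - 1][1] if idx else None
```

A production store would separate the two paths into different backends (a low-latency key-value store and a warehouse or lake), but the lookup semantics are the same.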
5

ML Pipelines & Orchestration

Build automated ML pipelines with modern orchestration tools.

~20h

Learning Objectives

  • Design DAG-based ML pipelines
  • Use Kubeflow Pipelines for end-to-end ML
  • Implement pipelines with Apache Airflow
  • Build pipelines with Prefect or Dagster
  • Handle pipeline failures and retries
  • Implement pipeline testing and validation
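
As a rough illustration of DAG-based execution with retries, the sketch below orders steps with Python's standard `graphlib.TopologicalSorter` and retries a failing step a fixed number of times. It is not how Kubeflow, Airflow, Prefect, or Dagster are implemented; `run_pipeline`, the step names, and the flaky training step are all hypothetical.

```python
from graphlib import TopologicalSorter

def run_pipeline(steps, dependencies, max_retries=2):
    """Run steps in dependency order, retrying each failed step up to max_retries times."""
    order = list(TopologicalSorter(dependencies).static_order())
    attempts = {}
    for name in order:
        for attempt in range(1, max_retries + 2):
            attempts[name] = attempt
            try:
                steps[name]()
                break  # step succeeded, move on to the next one
            except Exception:
                if attempt > max_retries:
                    raise  # retries exhausted: fail the whole pipeline

    return order, attempts

calls = {"train": 0}

def flaky_train():
    """Fails on the first call only, to simulate a transient error."""
    calls["train"] += 1
    if calls["train"] == 1:
        raise RuntimeError("transient failure")

steps = {
    "ingest": lambda: None,
    "validate": lambda: None,
    "train": flaky_train,
    "evaluate": lambda: None,
}
# Each key maps to the set of steps that must finish before it runs.
dependencies = {
    "ingest": set(),
    "validate": {"ingest"},
    "train": {"validate"},
    "evaluate": {"train"},
}
order, attempts = run_pipeline(steps, dependencies)
```

Here `train` succeeds on its second attempt, so the pipeline completes. Real orchestrators add backoff delays, per-task retry policies, and persisted state, which this sketch leaves out.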
6

Model Registry & Versioning

Implement model versioning, registry, and governance.

~12h

Learning Objectives

  • Set up MLflow Model Registry
  • Implement model versioning strategies
  • Design model promotion workflows (staging → production)
  • Track model lineage and metadata
  • Implement model governance policies
  • Handle A/B testing for model rollouts
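
A promotion workflow can be thought of as a small state machine with a quality gate in front of production. The sketch below assumes four stages (none, staging, production, archived) and an accuracy threshold as the gate. The class, method names, and threshold are hypothetical and do not reproduce the MLflow Model Registry API.

```python
ALLOWED_TRANSITIONS = {
    "none": {"staging"},
    "staging": {"production", "archived"},
    "production": {"archived"},
    "archived": set(),
}

class ModelRegistry:
    """Toy registry: versions per model name, each with a stage and metrics."""

    def __init__(self):
        self._versions = {}  # (name, version) -> {"stage": str, "metrics": dict}

    def register(self, name, metrics):
        version = 1 + sum(1 for (n, _) in self._versions if n == name)
        self._versions[(name, version)] = {"stage": "none", "metrics": metrics}
        return version

    def stage_of(self, name, version):
        return self._versions[(name, version)]["stage"]

    def promote(self, name, version, target, min_accuracy=0.0):
        entry = self._versions[(name, version)]
        if target not in ALLOWED_TRANSITIONS[entry["stage"]]:
            raise ValueError(f"cannot move from {entry['stage']} to {target}")
        if target == "production":
            # Governance gate: block promotion below the required accuracy.
            if entry["metrics"].get("accuracy", 0.0) < min_accuracy:
                raise ValueError("accuracy below promotion threshold")
            # Only one production version per model: archive the previous one.
            for (n, _), other in self._versions.items():
                if n == name and other["stage"] == "production":
                    other["stage"] = "archived"
        entry["stage"] = target
```

Forcing every version through staging, and archiving the old production version on promotion, keeps the lineage of what served traffic and when easy to reconstruct.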
7

Model Serving & Inference

Deploy models for both batch and real-time inference at scale.

~20h

Learning Objectives

  • Deploy with TensorFlow Serving and TorchServe
  • Use Triton Inference Server for multi-framework serving
  • Implement batch inference pipelines
  • Build real-time inference APIs with FastAPI
  • Optimize model serving with ONNX
  • Implement model ensembles and cascading
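
Model cascading, the last objective above, usually means trying a cheap model first and escalating to a more expensive one only when the cheap model is unsure. The following is a minimal sketch under that assumption; both "models" are hypothetical stand-in functions returning a `(label, confidence)` pair, and the 0.8 threshold is arbitrary.

```python
def cascade_predict(x, fast_model, slow_model, threshold=0.8):
    """Return (label, which_model) using the fast model when it is confident enough."""
    label, confidence = fast_model(x)
    if confidence >= threshold:
        return label, "fast"
    # Low confidence: pay for the slower, more accurate model.
    label, _ = slow_model(x)
    return label, "slow"

def fast_model(text):
    """Hypothetical keyword classifier: confident only on obvious inputs."""
    if "great" in text:
        return "positive", 0.95
    return "negative", 0.55

def slow_model(text):
    """Hypothetical heavier classifier used as the fallback."""
    if "not bad" in text:
        return "positive", 0.9
    return "negative", 0.9
```

With these stand-ins, `cascade_predict("great product", fast_model, slow_model)` is answered by the fast model, while `"not bad at all"` is escalated to the slow one. The threshold trades latency and cost against accuracy and would normally be tuned on held-out data.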
8

CI/CD for Machine Learning

Build CI/CD pipelines specifically designed for ML workflows.

~15h

Learning Objectives

  • Design ML-specific CI/CD pipelines
  • Implement automated testing for ML code
  • Build data validation in CI/CD
  • Automate model training and evaluation
  • Implement canary deployments for models
  • Use GitHub Actions/GitLab CI for MLOps
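
One common way to implement a canary deployment is to send a fixed percentage of requests to the new model, chosen deterministically so that the same caller always sees the same version. The sketch below assumes routing by a hash of a request or user ID; the `route` function and the ID format are hypothetical, and in practice this logic often lives in a service mesh or load balancer, not in application code.

```python
import hashlib

def route(request_id, canary_percent):
    """Return "canary" or "stable" for a request, stable across repeated calls."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the ID to one of 100 buckets; buckets below the cutoff go to the canary.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"
```

Because the bucket depends only on the ID, raising `canary_percent` from 10 to 50 keeps every existing canary caller on the canary and only adds new ones, which makes a gradual rollout easier to reason about.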
9

ML Monitoring & Observability

Monitor ML systems in production and detect model degradation.

~18h

Learning Objectives

  • Set up Prometheus and Grafana for ML metrics
  • Detect data drift and concept drift
  • Monitor model performance degradation
  • Implement alerting for ML systems
  • Build dashboards for ML observability
  • Use Evidently AI for ML monitoring
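
Data drift detection can be illustrated with the Population Stability Index (PSI), one of several statistics used for this purpose. The sketch below bins a reference sample and a production sample on the reference range and sums (actual − expected) × ln(actual / expected) over the bins. It is a standard-library illustration, not the Evidently AI API; the bin count, the `eps` floor for empty bins, and the function name are assumptions.

```python
import math

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a new sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate reference: every value falls in the first bin

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = max(0, min(bins - 1, idx))  # clip out-of-range values to the edge bins
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A widely quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as significant drift, but these cutoffs are conventions and not guarantees; an alerting threshold should be validated against the feature and model in question.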
10

LLMOps & GenAI Operations

Apply MLOps principles to LLM and generative AI systems.

~15h

Learning Objectives

  • Deploy and manage LLM inference at scale
  • Implement prompt versioning and management
  • Monitor LLM quality and safety
  • Handle RAG system operations
  • Manage fine-tuning workflows
  • Implement LLM cost optimization
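
Prompt versioning can be sketched as a registry that identifies each template by a content hash, so that a logged response can always be traced back to the exact prompt text that produced it. The class below is a hypothetical in-memory illustration and not the API of any particular prompt-management tool.

```python
import hashlib

class PromptRegistry:
    """Toy registry that versions prompt templates by content hash."""

    def __init__(self):
        self._versions = {}  # name -> list of (version_hash, template), oldest first

    def register(self, name, template):
        digest = hashlib.sha256(template.encode("utf-8")).hexdigest()[:8]
        history = self._versions.setdefault(name, [])
        # Re-registering the current template is a no-op.
        if not history or history[-1][0] != digest:
            history.append((digest, template))
        return digest

    def history(self, name):
        return [digest for digest, _ in self._versions[name]]

    def get(self, name, version=None):
        """Return the latest template, or the one matching a specific version hash."""
        entries = self._versions[name]
        if version is None:
            return entries[-1][1]
        for digest, template in entries:
            if digest == version:
                return template
        raise KeyError(version)
```

Logging the returned hash alongside each LLM call makes it possible to compare quality and cost per prompt version, and to roll back by pinning `version` to an earlier hash.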
