🧠 What is MLOps?¶
📌 Definition¶
What is MLOps?¶
MLOps (Machine Learning Operations) is the practice of applying DevOps principles to the machine learning lifecycle — combining collaboration, automation, and governance to reliably take ML models from experimentation to production.
It focuses on making ML workflows repeatable, auditable, and scalable, especially in production environments.
🧭 Scope of MLOps¶
What does MLOps cover?¶
MLOps spans the entire machine learning lifecycle, not just deployment:
- Data: ingestion, validation, versioning
- Modeling: training, tuning, experiment tracking
- Packaging: reproducible scripts, containerization
- Deployment: serving models as APIs or batch jobs
- Monitoring: detecting data drift, model decay, logging behavior
- Feedback Loops: closing the loop from real-world usage to retraining
MLOps is not limited to infra teams — data scientists, ML engineers, and platform teams all play a role.
🔁 MLOps vs DevOps vs DataOps¶
🪞 Key Differences¶
Comparing MLOps, DevOps, and DataOps¶
Aspect | DevOps | DataOps | MLOps |
---|---|---|---|
Focus | Software delivery & CI/CD | Data pipelines & data quality | ML model lifecycle & reliability |
Primary Unit | Code (apps, services) | Data (ETL, validation, lineage) | Models (training, deployment, monitoring) |
Key Roles | Developers, DevOps engineers | Data engineers, analysts | Data scientists, ML engineers |
Tooling | GitHub Actions, Jenkins, Docker | Airflow, dbt, Great Expectations | MLflow, TFX, SageMaker, Kubeflow |
Each discipline solves a different bottleneck, but they all aim for automation, reproducibility, and scale in production systems.
🔄 Overlaps & Confusions¶
Why These Terms Often Get Mixed Up¶
- DevOps and MLOps both use CI/CD, version control, and containerization — but MLOps adds data + model tracking.
- MLOps and DataOps both rely on good data practices — but DataOps stops short of modeling or deployment.
- In practice, MLOps sits at the intersection of both: you need DevOps tooling and DataOps discipline to do MLOps right.
❓ Why MLOps Matters¶
🧨 Real Pain Points¶
Problems That Arise Without MLOps¶
- No reproducibility: Results from notebooks can't be replicated in production.
- Ad hoc handoffs: Models are thrown over the wall to engineers with minimal documentation.
- Silent model decay: Performance drops over time due to changing data, but no one notices.
- Versioning chaos: Confusion over which model or dataset was used in production.
- Environment mismatches: Code works locally but fails in staging or production due to dependency drift.
💥 Failures Without MLOps¶
Realistic Consequences¶
- A model trained on clean, idealized data fails miserably when exposed to real user behavior.
- A one-off script introduces a bug during retraining, and no one notices for weeks.
- A business team makes decisions based on stale model predictions without knowing the model is outdated.
- Multiple teams unknowingly deploy slightly different versions of the “same” model.
These failures aren’t rare — they’re default behavior without MLOps practices in place.
🌀 MLOps Lifecycle¶
🧱 High-Level Stages¶
The Core Lifecycle of MLOps¶
1. Data Collection & Validation: source data is ingested, cleaned, profiled, and versioned.
2. Model Development: training, tuning, and evaluating models, often in notebooks or scripts.
3. Packaging & Testing: code and models are serialized, containerized, and tested for portability.
4. Deployment: models are exposed via APIs, batch jobs, or integrated into apps.
5. Monitoring: track predictions, usage metrics, model drift, and data quality.
6. Governance & Compliance: logging, reproducibility, access control, and auditability across stages.
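To make the early stages concrete, here is a minimal, illustrative sketch of stages 1 to 3 as plain Python functions. The input file, the `label` column, and the model choice are placeholder assumptions rather than a prescribed setup; deployment and monitoring appear in tool-specific sketches later in this notebook.

```python
# Illustrative skeleton of the first MLOps lifecycle stages as plain functions.
# Real pipelines would delegate these steps to orchestration and serving tools.
import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def ingest_and_validate(path: str) -> pd.DataFrame:
    """Data Collection & Validation: load raw data and run basic checks."""
    df = pd.read_csv(path)
    assert not df.empty, "no rows ingested"
    assert df["label"].notna().all(), "missing labels"  # assumes a 'label' column
    return df


def train_and_evaluate(df: pd.DataFrame):
    """Model Development: fit a simple model and report a holdout metric."""
    X, y = df.drop(columns=["label"]), df["label"]
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    score = accuracy_score(y_test, model.predict(X_test))
    return model, score


def package(model, path: str = "model.joblib") -> str:
    """Packaging & Testing: serialize the model so it can be containerized."""
    joblib.dump(model, path)
    return path


if __name__ == "__main__":
    data = ingest_and_validate("training_data.csv")  # hypothetical input file
    model, score = train_and_evaluate(data)
    artifact = package(model)
    print(f"holdout accuracy={score:.3f}, artifact={artifact}")
```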
🔄 Feedback Loops¶
From Real World → Retraining¶
- Monitoring feeds retraining: Real-time logs, drift metrics, and model errors inform the next training cycle.
- Closed-loop learning: Label feedback, human-in-the-loop corrections, or delayed outcomes are folded into future model updates.
- Versioned iterations: Each training cycle builds on a known version of code, data, and parameters.
MLOps ensures this loop is automated and reliable, not manual or reactive.
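As a rough illustration of how monitoring can feed retraining, the sketch below compares a recent feature distribution against the training baseline and flags retraining when a simple drift score crosses a threshold. The metric (PSI), the 0.2 threshold, and the retraining trigger are illustrative assumptions, not a standard.

```python
import numpy as np


def psi(baseline: np.ndarray, recent: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between a training baseline and live data.

    A common rule of thumb treats PSI > 0.2 as meaningful drift.
    """
    edges = np.histogram_bin_edges(baseline, bins=bins)
    expected = np.histogram(baseline, bins=edges)[0] / len(baseline)
    actual = np.histogram(recent, bins=edges)[0] / len(recent)
    # Clip to avoid log(0) when a bin is empty.
    expected = np.clip(expected, 1e-6, None)
    actual = np.clip(actual, 1e-6, None)
    return float(np.sum((actual - expected) * np.log(actual / expected)))


# Hypothetical scheduled monitoring job: compare live traffic to the baseline.
baseline = np.random.normal(0.0, 1.0, size=5_000)  # stands in for training data
recent = np.random.normal(0.4, 1.0, size=5_000)    # stands in for production logs

if psi(baseline, recent) > 0.2:
    print("Drift detected: trigger the retraining pipeline")  # e.g. enqueue a retraining job
```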
👥 Who Uses MLOps?¶
🧑‍🔬 Personas¶
Key Roles Involved in MLOps¶
- Data Scientist: Develops models, interprets results, defines metrics.
- ML Engineer: Productionizes models, builds pipelines, handles scalability.
- Data Engineer: Manages data pipelines, transformations, and storage.
- DevOps/Infra Engineer: Maintains infrastructure, automates CI/CD, ensures reliability.
- Product Owner / Analyst: Provides context, evaluates outcomes, communicates needs.
MLOps is cross-functional by design — no single person owns it end-to-end.
🧭 Responsibility Split¶
Who Owns What?¶
Stage | Primary Owner(s) |
---|---|
Data Ingestion | Data Engineer |
Feature Engineering | Data Scientist, Data Engineer |
Model Training | Data Scientist |
Experiment Tracking | Data Scientist, ML Engineer |
Deployment | ML Engineer, DevOps |
Monitoring | ML Engineer, Infra/DevOps |
Feedback Loop | Data Scientist, Product, ML Engineer |
Responsibility often overlaps — coordination is key to avoid blind spots.
🧰 Common Tools by Stage¶
🧪 Training + Tracking¶
Tools That Support Model Development¶
- MLflow – Tracks experiments, parameters, artifacts, and metrics.
- Weights & Biases (W&B) – Visual experiment dashboard, collaboration features.
- DVC – Version control for datasets and model artifacts.
- Optuna / Ray Tune – Hyperparameter tuning frameworks.
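For example, MLflow's tracking API takes only a few lines to add to an existing training script. The experiment name, hyperparameters, and metric below are placeholders; a minimal sketch, assuming scikit-learn and a local MLflow tracking store:

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1_000, random_state=42)

mlflow.set_experiment("churn-model")  # experiment name is a placeholder

with mlflow.start_run():
    params = {"n_estimators": 200, "max_depth": 5}
    model = RandomForestClassifier(**params, random_state=42).fit(X, y)
    cv_accuracy = cross_val_score(model, X, y, cv=5).mean()

    mlflow.log_params(params)                      # hyperparameters
    mlflow.log_metric("cv_accuracy", cv_accuracy)  # evaluation metric
    mlflow.sklearn.log_model(model, "model")       # serialized model artifact
```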
📊 Monitoring¶
Tools for Post-Deployment Visibility¶
- Prometheus + Grafana – Collect and visualize performance metrics.
- WhyLabs / Evidently – Detect drift, monitor data and model health.
- ELK Stack (Elasticsearch, Logstash, Kibana) – Log aggregation and search.
- Custom Dashboards – Streamlit, Superset, or internal tools for surfacing signals.
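As one concrete example from the list above, a Python prediction service can expose a prediction counter and a latency histogram with the prometheus_client library for Prometheus to scrape. The metric names, port, and dummy inference logic are illustrative assumptions:

```python
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

# Metric names are illustrative; follow your own naming conventions.
PREDICTIONS = Counter("model_predictions_total", "Number of predictions served")
LATENCY = Histogram("model_prediction_latency_seconds", "Prediction latency")


@LATENCY.time()               # records how long each call takes
def predict(features: dict) -> float:
    PREDICTIONS.inc()         # counts every prediction served
    time.sleep(random.uniform(0.01, 0.05))  # stand-in for real inference
    return 0.5                # dummy score


if __name__ == "__main__":
    start_http_server(8000)   # Prometheus scrapes metrics from :8000/metrics
    while True:               # simulate a long-running service handling traffic
        predict({"feature": 1.0})
```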
📶 MLOps Maturity Levels¶
🔢 Level 1 – Partial Automation¶
Some Repeatability and Tracking¶
- Training moved to scripts with basic automation.
- Experiments tracked using MLflow or W&B.
- Models deployed with basic CI/CD or manual API wrapping.
- Some monitoring exists, but limited in scope.
Teams at this stage are usually growing or scaling up.
🔢 Level 2 – Full Automation¶
End-to-End MLOps Pipeline¶
- Data, code, and model artifacts are fully versioned.
- CI/CD automates training, testing, and deployment.
- Monitoring alerts trigger retraining or rollback.
- Model registry, approval flows, and rollback strategies in place.
This level is typical of mature ML product teams or organizations with a dedicated ML platform.
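As a concrete example of the registry and promotion bullets above, the sketch below registers a model from a finished MLflow run and promotes it with an alias (supported in recent MLflow versions). The run ID, model name, and alias are made up for illustration:

```python
import mlflow
from mlflow import MlflowClient

# Register the model logged by a finished training run (run ID is hypothetical).
version = mlflow.register_model(
    model_uri="runs:/abc123/model",  # artifact path used during training
    name="churn-model",              # registered model name (illustrative)
)

# Separate approval/promotion step, e.g. after offline evaluation and review.
client = MlflowClient()
client.set_registered_model_alias(
    name="churn-model",
    alias="production",              # serving layer loads "churn-model@production"
    version=version.version,
)
```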
🧑‍💻 How Much Should a Data Scientist Know?¶
⚖️ Breadth vs Depth¶
You Don’t Need to Be an MLE¶
A data scientist doesn’t need to master infra or deployment tools, but should:
- Understand the full ML lifecycle — not just modeling.
- Collaborate effectively with engineers by knowing the basics of CI/CD, Docker, and model serving.
- Speak the language of operations and recognize production constraints.
🧱 Minimum Baseline¶
What You Should Know to Be Dangerous¶
- How your model will be used: batch vs real-time, latency limits, API vs offline job.
- What can break: data drift, retraining issues, environment mismatches.
- How to track your work: MLflow, version control, reproducible scripts.
- How to communicate handoffs: clear documentation, artifacts, config files.
You don’t need to build the MLOps stack — but you should build for it.
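One lightweight way to meet this baseline is to move notebook code into a small parameterized training script. In the sketch below, the file paths, column names, model choice, and config dump are placeholders; it simply shows how a parameterized entry point and a saved config make handoffs explicit:

```python
import argparse
import json

import joblib
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier


def main() -> None:
    parser = argparse.ArgumentParser(description="Reproducible training entry point")
    parser.add_argument("--data", required=True, help="Path to training CSV")
    parser.add_argument("--target", default="label", help="Target column name")
    parser.add_argument("--learning-rate", type=float, default=0.1)
    parser.add_argument("--output", default="model.joblib")
    args = parser.parse_args()

    df = pd.read_csv(args.data)
    X, y = df.drop(columns=[args.target]), df[args.target]
    model = GradientBoostingClassifier(learning_rate=args.learning_rate).fit(X, y)

    joblib.dump(model, args.output)
    # Record the exact configuration alongside the artifact for handoffs.
    with open(args.output + ".config.json", "w") as f:
        json.dump(vars(args), f, indent=2)


if __name__ == "__main__":
    main()
```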
✅ Summary¶
🔁 Key Takeaways¶
- MLOps is not just deployment — it covers the full ML lifecycle from data to feedback loops.
- It brings structure, repeatability, and monitoring to what is often an ad hoc process.
- MLOps is collaborative — no single person owns it end-to-end.
- You don’t need to be an infra expert, but you should design with production in mind.
📍 What's Next¶
In the next notebook (02_Model_Packaging.ipynb), we'll move from theory to practice:
- Structure your ML projects for reuse.
- Save models in a portable format.
- Create training scripts and basic CLI interfaces.
- Learn the basics of containerization with Docker.
This begins the transition from experimentation → deployable asset.