MLOps Pipeline: From Jupyter Notebook to Production API
Build a complete MLOps pipeline from Jupyter prototyping to a production ML API. Model versioning, testing, Docker deployment, and monitoring included.
The Notebook-to-Production Gap
Every ML project starts the same way: a data scientist builds a promising model in a Jupyter notebook. It achieves great metrics. Everyone is excited. Then someone asks, "How do we deploy this?"
ML pipeline: from raw data collection through training, evaluation, deployment, and continuous monitoring.
The notebook has hardcoded file paths, global variables, no error handling, and dependencies installed with random pip commands. Turning this into a reliable production service takes more engineering than the model itself. This guide bridges that gap.
The MLOps Pipeline
Notebook → Refactored Code → Tests → Docker → API → Monitoring
   ↑                                                    │
   └───────── Feedback Loop (retrain, improve) ─────────┘
Step 1: Refactor the Notebook into Modules
A typical notebook mess:
# Cell 1 (the notebook version)
import pandas as pd
df = pd.read_csv("/home/user/data/sales.csv")
df = df.dropna()
df['date'] = pd.to_datetime(df['date'])
# ... 50 more lines of preprocessing
Refactored into proper modules:
ml-project/
├── src/
│   ├── __init__.py
│   ├── data.py          # Data loading and preprocessing
│   ├── features.py      # Feature engineering
│   ├── model.py         # Model training and prediction
│   └── config.py        # Configuration
├── tests/
│   ├── test_data.py
│   ├── test_features.py
│   └── test_model.py
├── api/
│   └── main.py          # FastAPI application
├── models/              # Saved model artifacts
├── Dockerfile
├── pyproject.toml
└── README.md
# src/data.py
import pandas as pd
from pathlib import Path

def load_and_clean(data_path: Path) -> pd.DataFrame:
    """Load CSV data and perform basic cleaning."""
    df = pd.read_csv(data_path)
    df = df.dropna(subset=["date", "amount"])
    df["date"] = pd.to_datetime(df["date"])
    return df

def validate_schema(df: pd.DataFrame) -> bool:
    """Validate that required columns exist and have correct types."""
    required = {"date": "datetime64[ns]", "amount": "float64"}
    for col, dtype in required.items():
        if col not in df.columns:
            raise ValueError(f"Missing column: {col}")
        if str(df[col].dtype) != dtype:
            raise TypeError(f"Column {col}: expected {dtype}, got {df[col].dtype}")
    return True
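The project tree above also lists a src/config.py, which is where the notebook's hardcoded paths and hyperparameters should end up. A minimal sketch using a stdlib dataclass; the field names, env-var names, and defaults here are illustrative assumptions, not a prescribed layout:

```python
# src/config.py (hypothetical sketch): one place for everything the
# notebook used to hardcode, overridable via environment variables.
import os
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class Config:
    data_path: Path = Path(os.environ.get("DATA_PATH", "data/sales.csv"))
    model_dir: Path = Path(os.environ.get("MODEL_DIR", "models"))
    n_estimators: int = 200
    max_depth: int = 5
    learning_rate: float = 0.1

config = Config()
```

Freezing the dataclass prevents code elsewhere from mutating configuration at runtime, which keeps training and serving reproducible.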
# src/model.py
import joblib
from pathlib import Path
from sklearn.ensemble import GradientBoostingRegressor
import numpy as np

class SalesPredictor:
    def __init__(self, model_path: Path | None = None):
        if model_path and model_path.exists():
            self.model = joblib.load(model_path)
        else:
            self.model = GradientBoostingRegressor(
                n_estimators=200,
                max_depth=5,
                learning_rate=0.1
            )

    def train(self, X: np.ndarray, y: np.ndarray):
        self.model.fit(X, y)
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        return self.model.predict(X)

    def save(self, path: Path):
        joblib.dump(self.model, path)

    def metrics(self, X: np.ndarray, y: np.ndarray) -> dict:
        from sklearn.metrics import mean_absolute_error, r2_score
        predictions = self.predict(X)
        return {
            "mae": mean_absolute_error(y, predictions),
            "r2": r2_score(y, predictions)
        }
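Tying the modules together is a small training entry point (train, evaluate, save). The sketch below shows the flow with the sklearn pieces used directly and synthetic data standing in for the output of load_and_clean, so it runs standalone; in the real project it would import SalesPredictor and the data module instead:

```python
# Hypothetical training entry point: fit, evaluate, persist the artifact.
# Synthetic data replaces load_and_clean() so the snippet is self-contained.
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.random((500, 5))
y = X[:, 0] * 2 + X[:, 1] * 3 + rng.normal(0, 0.05, 500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = GradientBoostingRegressor(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)

# Evaluate on the held-out split before promoting the artifact
preds = model.predict(X_test)
metrics = {"mae": mean_absolute_error(y_test, preds), "r2": r2_score(y_test, preds)}
print(metrics)

# Persist with joblib (a temp dir here; the real script writes to models/)
out = Path(tempfile.mkdtemp()) / "sales_model.joblib"
joblib.dump(model, out)
```

Evaluating on a held-out split before saving is what lets the versioning step later record honest metrics alongside each artifact.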
Step 2: Add Tests
ML code needs tests too. Test data processing, feature engineering, and model behavior:
# tests/test_model.py
import numpy as np
import pytest
from src.model import SalesPredictor

@pytest.fixture
def trained_model():
    model = SalesPredictor()
    X = np.random.rand(100, 5)
    y = X[:, 0] * 2 + X[:, 1] * 3 + np.random.rand(100) * 0.1
    model.train(X, y)
    return model

def test_prediction_shape(trained_model):
    X = np.random.rand(10, 5)
    predictions = trained_model.predict(X)
    assert predictions.shape == (10,)

def test_prediction_range(trained_model):
    X = np.random.rand(10, 5)
    predictions = trained_model.predict(X)
    assert all(-100 < p < 100 for p in predictions)

def test_model_save_load(trained_model, tmp_path):
    model_path = tmp_path / "model.joblib"
    trained_model.save(model_path)
    loaded = SalesPredictor(model_path)
    X = np.random.rand(5, 5)
    np.testing.assert_array_almost_equal(
        trained_model.predict(X),
        loaded.predict(X)
    )
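The tree also lists tests/test_data.py for the data layer. A sketch of what its schema tests might look like; validate_schema is inlined here (with the type check the docstring promises) so the snippet runs standalone:

```python
# Sketch of tests/test_data.py; validate_schema is inlined from src/data.py
# so this snippet is self-contained.
import pandas as pd
import pytest

def validate_schema(df: pd.DataFrame) -> bool:
    required = {"date": "datetime64[ns]", "amount": "float64"}
    for col, dtype in required.items():
        if col not in df.columns:
            raise ValueError(f"Missing column: {col}")
        if str(df[col].dtype) != dtype:
            raise TypeError(f"Column {col}: expected {dtype}, got {df[col].dtype}")
    return True

def test_valid_frame_passes():
    df = pd.DataFrame({"date": pd.to_datetime(["2024-01-01"]), "amount": [9.99]})
    assert validate_schema(df)

def test_missing_column_raises():
    with pytest.raises(ValueError, match="amount"):
        validate_schema(pd.DataFrame({"date": pd.to_datetime(["2024-01-01"])}))

def test_wrong_dtype_raises():
    df = pd.DataFrame({"date": pd.to_datetime(["2024-01-01"]), "amount": [1]})  # int64
    with pytest.raises(TypeError):
        validate_schema(df)
```

Testing the failure paths matters as much as the happy path: a schema check that never raises catches nothing in production.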
Step 3: Build the API
# api/main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, field_validator
import numpy as np
from pathlib import Path
from src.model import SalesPredictor

app = FastAPI(title="Sales Prediction API", version="1.0.0")

MODEL_PATH = Path("models/sales_model.joblib")
predictor = SalesPredictor(MODEL_PATH)

class PredictionRequest(BaseModel):
    features: list[float]

    @field_validator("features")
    @classmethod
    def validate_features(cls, v):
        if len(v) != 5:
            raise ValueError("Expected 5 features")
        return v

class PredictionResponse(BaseModel):
    prediction: float
    model_version: str

@app.post("/predict", response_model=PredictionResponse)
def predict(request: PredictionRequest):
    try:
        X = np.array([request.features])
        prediction = predictor.predict(X)[0]
        return PredictionResponse(
            prediction=round(float(prediction), 2),
            model_version="1.0.0"
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/health")
def health():
    return {"status": "healthy", "model_loaded": predictor.model is not None}
Step 4: Dockerize
FROM python:3.12-slim
WORKDIR /app
# Copy the source before installing so pip can build the package
COPY pyproject.toml .
COPY src/ src/
COPY api/ api/
RUN pip install --no-cache-dir .
COPY models/ models/
RUN useradd --create-home appuser
USER appuser
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=5s \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"
CMD ["uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", "8000"]
Step 5: Model Versioning
Never overwrite models. Version everything:
# src/versioning.py
import json
from datetime import datetime, timezone
from pathlib import Path

def save_versioned_model(model, metrics: dict, model_dir: Path) -> str:
    """Save model with version metadata."""
    now = datetime.now(timezone.utc)
    timestamp = now.strftime("%Y%m%d_%H%M%S")
    version_dir = model_dir / timestamp
    version_dir.mkdir(parents=True, exist_ok=True)
    model.save(version_dir / "model.joblib")
    metadata = {
        "version": timestamp,
        "metrics": metrics,
        "created_at": now.isoformat(),
    }
    (version_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
    # Update symlink to latest (relative target, so model_dir stays portable)
    latest = model_dir / "latest"
    if latest.is_symlink():
        latest.unlink()
    latest.symlink_to(timestamp)
    return timestamp
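With metadata.json written next to each artifact, promotion does not have to mean "newest wins". A stdlib-only sketch, assuming the versioned layout above, that picks the version with the lowest MAE (the function name and selection rule are illustrative):

```python
# Pick the best model version by a metric recorded in each metadata.json.
import json
import tempfile
from pathlib import Path

def best_version(model_dir: Path, metric: str = "mae") -> str:
    """Return the version directory name with the lowest value for `metric`."""
    candidates = []
    for meta_file in model_dir.glob("*/metadata.json"):
        meta = json.loads(meta_file.read_text())
        candidates.append((meta["metrics"][metric], meta["version"]))
    if not candidates:
        raise FileNotFoundError(f"No versioned models under {model_dir}")
    return min(candidates)[1]  # lowest metric wins

# Demo against a throwaway directory with two fake versions
root = Path(tempfile.mkdtemp())
for version, mae in [("20240101_000000", 12.5), ("20240201_000000", 9.1)]:
    d = root / version
    d.mkdir()
    (d / "metadata.json").write_text(
        json.dumps({"version": version, "metrics": {"mae": mae}})
    )
print(best_version(root))  # the 20240201 run has the lower MAE
```

Pointing the latest symlink at best_version's result instead of the newest timestamp guards against promoting a regression.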
Step 6: Monitoring in Production
Track model performance to detect drift:
# api/monitoring.py
import time
from prometheus_client import Histogram, Counter

PREDICTION_LATENCY = Histogram(
    "prediction_latency_seconds",
    "Time to generate a prediction"
)
PREDICTION_COUNT = Counter(
    "predictions_total",
    "Total predictions made",
    ["model_version"]
)
PREDICTION_VALUE = Histogram(
    "prediction_value",
    "Distribution of predicted values",
    buckets=[0, 10, 50, 100, 500, 1000, 5000]
)

# In your predict endpoint:
@app.post("/predict")
def predict(request: PredictionRequest):
    start = time.time()
    prediction = predictor.predict(...)
    PREDICTION_LATENCY.observe(time.time() - start)
    PREDICTION_COUNT.labels(model_version="1.0.0").inc()
    PREDICTION_VALUE.observe(prediction)
    return {"prediction": prediction}
Visualize in Grafana: prediction latency, throughput, value distribution. When the distribution shifts significantly from training data, it is time to retrain.
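"Shifts significantly" can be quantified. One common choice is the Population Stability Index (PSI) between the training-time distribution of predictions and the live one; a widely used rule of thumb treats PSI above 0.2 as meaningful drift. A numpy sketch (the bucket count, epsilon, and threshold are assumptions, not part of the pipeline above):

```python
# Population Stability Index: compare live predictions to the training
# distribution, bucket by bucket.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """PSI between a reference sample and a live sample of predictions."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    # Proportions per bucket; a tiny epsilon avoids log(0) and division by zero
    eps = 1e-6
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected) + eps
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual) + eps
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
train_preds = rng.normal(100, 15, 10_000)
stable = rng.normal(100, 15, 10_000)   # same distribution as training
shifted = rng.normal(130, 15, 10_000)  # mean has drifted upward

print(psi(train_preds, stable))   # near 0: no drift
print(psi(train_preds, shifted))  # well above 0.2: time to retrain
```

Fed from the PREDICTION_VALUE histogram (or a sampled log of raw predictions), a scheduled PSI check can open the retraining loop automatically instead of waiting for someone to notice a Grafana panel.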
A typical CI/CD pipeline: code flows through build, test, and deploy stages automatically.
The Complete CI/CD Pipeline
# .gitea/workflows/ml-pipeline.yml
name: ML Pipeline
on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install -e ".[test]"
      - run: pytest tests/ -v --tb=short

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t registry.techsaas.cloud/ml-api:$GITHUB_SHA .
      - run: docker push registry.techsaas.cloud/ml-api:$GITHUB_SHA

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - run: |
          ssh deploy@prod "docker pull registry.techsaas.cloud/ml-api:$GITHUB_SHA && docker compose up -d"
The gap between a Jupyter notebook and a production ML service is real, but it is bridgeable with good engineering practices. At TechSaaS, we help teams build these pipelines so data scientists can focus on models while the infrastructure handles everything else.