
MLOps Pipeline: From Jupyter Notebook to Production API

Build a complete MLOps pipeline from Jupyter prototyping to a production ML API. Model versioning, testing, Docker deployment, and monitoring included.

Yash Pritwani
14 min read

The Notebook-to-Production Gap

Every ML project starts the same way: a data scientist builds a promising model in a Jupyter notebook. It achieves great metrics. Everyone is excited. Then someone asks, "How do we deploy this?"


ML pipeline: from raw data collection through training, evaluation, deployment, and continuous monitoring.

The notebook has hardcoded file paths, global variables, no error handling, and dependencies installed with random pip commands. Turning this into a reliable production service takes more engineering than the model itself. This guide bridges that gap.

The MLOps Pipeline

Notebook → Refactored Code → Tests → Docker → API → Monitoring
    ↑                                                    |
    └──────── Feedback Loop (retrain, improve) ──────────┘

Step 1: Refactor the Notebook into Modules

A typical notebook mess:


# Cell 1 (the notebook version)
import pandas as pd
df = pd.read_csv("/home/user/data/sales.csv")
df = df.dropna()
df['date'] = pd.to_datetime(df['date'])
# ... 50 more lines of preprocessing

Refactored into proper modules:

ml-project/
├── src/
│   ├── __init__.py
│   ├── data.py          # Data loading and preprocessing
│   ├── features.py      # Feature engineering
│   ├── model.py         # Model training and prediction
│   └── config.py        # Configuration
├── tests/
│   ├── test_data.py
│   ├── test_features.py
│   └── test_model.py
├── api/
│   └── main.py          # FastAPI application
├── models/              # Saved model artifacts
├── Dockerfile
├── pyproject.toml
└── README.md

# src/data.py
import pandas as pd
from pathlib import Path

def load_and_clean(data_path: Path) -> pd.DataFrame:
    """Load CSV data and perform basic cleaning."""
    df = pd.read_csv(data_path)
    df = df.dropna(subset=["date", "amount"])
    df["date"] = pd.to_datetime(df["date"])
    return df

def validate_schema(df: pd.DataFrame) -> bool:
    """Validate that required columns exist and have correct types."""
    required = {"date": "datetime64[ns]", "amount": "float64"}
    for col, dtype in required.items():
        if col not in df.columns:
            raise ValueError(f"Missing column: {col}")
        if str(df[col].dtype) != dtype:
            raise TypeError(f"Column '{col}' has dtype {df[col].dtype}, expected {dtype}")
    return True

# src/model.py
import joblib
from pathlib import Path
from sklearn.ensemble import GradientBoostingRegressor
import numpy as np

class SalesPredictor:
    def __init__(self, model_path: Path | None = None):
        if model_path and model_path.exists():
            self.model = joblib.load(model_path)
        else:
            self.model = GradientBoostingRegressor(
                n_estimators=200,
                max_depth=5,
                learning_rate=0.1
            )

    def train(self, X: np.ndarray, y: np.ndarray):
        self.model.fit(X, y)
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        return self.model.predict(X)

    def save(self, path: Path):
        joblib.dump(self.model, path)

    def metrics(self, X: np.ndarray, y: np.ndarray) -> dict:
        from sklearn.metrics import mean_absolute_error, r2_score
        predictions = self.predict(X)
        return {
            "mae": mean_absolute_error(y, predictions),
            "r2": r2_score(y, predictions)
        }

Step 2: Add Tests

ML code needs tests too. Test data processing, feature engineering, and model behavior:

# tests/test_model.py
import numpy as np
import pytest
from src.model import SalesPredictor

@pytest.fixture
def trained_model():
    model = SalesPredictor()
    X = np.random.rand(100, 5)
    y = X[:, 0] * 2 + X[:, 1] * 3 + np.random.rand(100) * 0.1
    model.train(X, y)
    return model

def test_prediction_shape(trained_model):
    X = np.random.rand(10, 5)
    predictions = trained_model.predict(X)
    assert predictions.shape == (10,)

def test_prediction_range(trained_model):
    X = np.random.rand(10, 5)
    predictions = trained_model.predict(X)
    assert all(p > -100 and p < 100 for p in predictions)

def test_model_save_load(trained_model, tmp_path):
    model_path = tmp_path / "model.joblib"
    trained_model.save(model_path)
    loaded = SalesPredictor(model_path)
    X = np.random.rand(5, 5)
    np.testing.assert_array_almost_equal(
        trained_model.predict(X),
        loaded.predict(X)
    )
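The same discipline applies to the data layer. A sketch of `tests/test_data.py` exercising `load_and_clean` follows; the function is inlined here so the example is self-contained, but in the repo you would import it from `src.data`:

```python
# tests/test_data.py -- sketch; load_and_clean inlined so the snippet
# runs standalone (in the repo, import it from src.data instead)
import pandas as pd

def load_and_clean(data_path):
    df = pd.read_csv(data_path)
    df = df.dropna(subset=["date", "amount"])
    df["date"] = pd.to_datetime(df["date"])
    return df

def test_drops_rows_with_missing_values(tmp_path):
    csv = tmp_path / "sales.csv"
    csv.write_text("date,amount\n2024-01-01,100.0\n,200.0\n2024-01-03,\n")
    df = load_and_clean(csv)
    assert len(df) == 1  # rows missing date or amount are removed
    assert pd.api.types.is_datetime64_any_dtype(df["date"])
```

Tests like this catch silent schema breakage (a renamed column, a string-typed date) long before it reaches the model.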

Neural network architecture: data flows through input, hidden, and output layers.

Step 3: Build the API

# api/main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, field_validator
import numpy as np
from pathlib import Path
from src.model import SalesPredictor

app = FastAPI(title="Sales Prediction API", version="1.0.0")

MODEL_PATH = Path("models/sales_model.joblib")
predictor = SalesPredictor(MODEL_PATH)

class PredictionRequest(BaseModel):
    features: list[float]

    @field_validator("features")
    @classmethod
    def validate_features(cls, v):
        if len(v) != 5:
            raise ValueError("Expected 5 features")
        return v

class PredictionResponse(BaseModel):
    prediction: float
    model_version: str

@app.post("/predict", response_model=PredictionResponse)
def predict(request: PredictionRequest):
    try:
        X = np.array([request.features])
        prediction = predictor.predict(X)[0]
        return PredictionResponse(
            prediction=round(float(prediction), 2),
            model_version="1.0.0"
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/health")
def health():
    return {"status": "healthy", "model_loaded": predictor.model is not None}

Step 4: Dockerize

FROM python:3.12-slim

WORKDIR /app
# Copy the source before installing so `pip install .` can build the package
COPY pyproject.toml .
COPY src/ src/
COPY api/ api/
RUN pip install --no-cache-dir .

COPY models/ models/

RUN useradd --create-home appuser
USER appuser

EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=5s \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"

CMD ["uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", "8000"]

Step 5: Model Versioning

Never overwrite models. Version everything:

# src/versioning.py
import json
from datetime import datetime, timezone
from pathlib import Path

def save_versioned_model(model, metrics: dict, model_dir: Path):
    """Save model with version metadata."""
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    version_dir = model_dir / timestamp

    version_dir.mkdir(parents=True, exist_ok=True)
    model.save(version_dir / "model.joblib")

    metadata = {
        "version": timestamp,
        "metrics": metrics,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    (version_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))

    # Point the "latest" symlink at the new version
    latest = model_dir / "latest"
    if latest.is_symlink() or latest.exists():
        latest.unlink()
    latest.symlink_to(version_dir)

    return timestamp
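The loading side is the mirror image: follow the `latest` symlink back to a concrete version directory and read its metadata. A sketch, using the file layout `save_versioned_model` writes (`load_latest_metadata` is a hypothetical helper name):

```python
# src/versioning.py (counterpart) -- find the current model version by
# following the `latest` symlink and reading its metadata.json.
import json
from pathlib import Path

def load_latest_metadata(model_dir: Path) -> dict:
    latest = model_dir / "latest"
    if not latest.exists():
        raise FileNotFoundError(f"No 'latest' symlink in {model_dir}")
    version_dir = latest.resolve()
    return json.loads((version_dir / "metadata.json").read_text())
```

With metrics stored alongside each artifact, a deploy script can refuse to promote a model whose MAE regressed against the previous version.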

Step 6: Monitoring in Production

Track model performance to detect drift:

# api/monitoring.py
import time
from prometheus_client import Histogram, Counter, Gauge

PREDICTION_LATENCY = Histogram(
    "prediction_latency_seconds",
    "Time to generate a prediction"
)

PREDICTION_COUNT = Counter(
    "predictions_total",
    "Total predictions made",
    ["model_version"]
)

PREDICTION_VALUE = Histogram(
    "prediction_value",
    "Distribution of predicted values",
    buckets=[0, 10, 50, 100, 500, 1000, 5000]
)

# In your predict endpoint:
@app.post("/predict")
def predict(request: PredictionRequest):
    start = time.time()
    prediction = float(predictor.predict(...)[0])
    PREDICTION_LATENCY.observe(time.time() - start)
    PREDICTION_COUNT.labels(model_version="1.0.0").inc()
    PREDICTION_VALUE.observe(prediction)
    return {"prediction": prediction}

Visualize in Grafana: prediction latency, throughput, value distribution. When the distribution shifts significantly from training data, it is time to retrain.
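One lightweight way to quantify "shifts significantly" is the population stability index (PSI) between training-time predictions and a recent serving window. A sketch in plain NumPy; the 0.1/0.2 thresholds are common rules of thumb, not standards:

```python
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between a reference sample (e.g. training
    predictions) and a recent sample (e.g. last week's predictions)."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor the proportions to avoid log(0) / division by zero
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

# Rule of thumb: < 0.1 stable, 0.1-0.2 moderate shift, > 0.2 investigate/retrain
```

Run this on a schedule against logged predictions and alert when the index crosses your threshold.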


A typical CI/CD pipeline: code flows through build, test, and deploy stages automatically.

The Complete CI/CD Pipeline

# .gitea/workflows/ml-pipeline.yml
name: ML Pipeline
on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install -e ".[test]"
      - run: pytest tests/ -v --tb=short

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t registry.techsaas.cloud/ml-api:$GITHUB_SHA .
      - run: docker push registry.techsaas.cloud/ml-api:$GITHUB_SHA

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - run: |
          ssh deploy@prod "docker pull registry.techsaas.cloud/ml-api:$GITHUB_SHA && docker compose up -d"

The gap between a Jupyter notebook and a production ML service is real, but it is bridgeable with good engineering practices. At TechSaaS, we help teams build these pipelines so data scientists can focus on models while the infrastructure handles everything else.

#mlops #machine-learning #api #docker #model-serving #fastapi
