Monitoring Docker Containers with cAdvisor and Prometheus

Set up comprehensive Docker container monitoring with cAdvisor for metrics collection, Prometheus for storage and alerting, and Grafana for visualization....

Y
Yash Pritwani
13 min read

Why Container Monitoring Matters

Running Docker containers without monitoring is like driving without a dashboard. You need to know CPU usage, memory consumption, network I/O, and disk usage for every container — in real-time and historically.

OrchestratorNode 1Container AContainer BNode 2Container CContainer ANode 3Container BContainer D

Container orchestration distributes workloads across multiple nodes for resilience and scale.

The Monitoring Stack

Our monitoring stack consists of three components:

  • cAdvisor: Collects container metrics (built by Google)
  • Prometheus: Stores metrics and evaluates alerting rules
  • Grafana: Visualizes everything in beautiful dashboards

Docker Compose Setup

version: "3.8"
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    privileged: true
    devices:
      - /dev/kmsg:/dev/kmsg
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    ports:
      - "8080:8080"
    restart: unless-stopped

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=30d'
      - '--web.enable-lifecycle'
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - ./prometheus/alert-rules.yml:/etc/prometheus/alert-rules.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      GF_SECURITY_ADMIN_USER: admin
      GF_SECURITY_ADMIN_PASSWORD: your-grafana-password
      GF_SERVER_ROOT_URL: https://monitoring.example.com
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
    ports:
      - "3000:3000"
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:

Get more insights on DevOps

Join 2,000+ engineers who get our weekly deep-dives. No spam, unsubscribe anytime.

Prometheus Configuration

# prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "alert-rules.yml"

scrape_configs:
  - job_name: "cadvisor"
    static_configs:
      - targets: ["cadvisor:8080"]

  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node-exporter"
    static_configs:
      - targets: ["node-exporter:9100"]

  - job_name: "docker"
    static_configs:
      - targets: ["host.docker.internal:9323"]

Alert Rules

# prometheus/alert-rules.yml
groups:
  - name: container-alerts
    rules:
      - alert: ContainerHighCPU
        expr: rate(container_cpu_usage_seconds_total{name!=""}[5m]) * 100 > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container {{ .Labels.name }} high CPU usage"
          description: "Container {{ .Labels.name }} CPU usage is above 80% for 5 minutes"

      - alert: ContainerHighMemory
        expr: container_memory_usage_bytes{name!=""} / container_spec_memory_limit_bytes{name!=""} * 100 > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container {{ .Labels.name }} high memory usage"

      - alert: ContainerDown
        expr: absent(container_last_seen{name=~".+"})
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Container {{ .Labels.name }} is down"

      - alert: HighDiskUsage
        expr: (node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_avail_bytes{mountpoint="/"}) / node_filesystem_size_bytes{mountpoint="/"} * 100 > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Disk usage above 85%"
docker-compose.ymlWeb AppAPI ServerDatabaseCacheDocker Network:3000:8080:5432:6379

Docker Compose defines your entire application stack in a single YAML file.

Useful PromQL Queries

# CPU usage per container (percentage)
rate(container_cpu_usage_seconds_total{name!=""}[5m]) * 100

# Memory usage per container (MB)
container_memory_usage_bytes{name!=""} / 1024 / 1024

# Network received per container (bytes/sec)
rate(container_network_receive_bytes_total{name!=""}[5m])

# Network transmitted per container (bytes/sec)
rate(container_network_transmit_bytes_total{name!=""}[5m])

# Disk read per container (bytes/sec)
rate(container_fs_reads_bytes_total{name!=""}[5m])

# Container restart count
increase(container_restart_count{name!=""}[1h])

# Top 10 containers by memory usage
topk(10, container_memory_usage_bytes{name!=""})

Grafana Dashboard

Import the Docker Container dashboard (ID: 11600) from grafana.com for a pre-built view. For custom dashboards, use these panels:

Container Overview Panel

{
  "title": "Container CPU Usage",
  "type": "timeseries",
  "datasource": "Prometheus",
  "targets": [
    {
      "expr": "rate(container_cpu_usage_seconds_total{name!=''}[5m]) * 100",
      "legendFormat": "{{ name }}"
    }
  ],
  "fieldConfig": {
    "defaults": {
      "unit": "percent",
      "thresholds": {
        "steps": [
          { "value": 0, "color": "green" },
          { "value": 50, "color": "yellow" },
          { "value": 80, "color": "red" }
        ]
      }
    }
  }
}

Alertmanager Integration

Route Prometheus alerts to Ntfy for push notifications:

# alertmanager.yml
route:
  receiver: "ntfy"
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: "ntfy"
    webhook_configs:
      - url: "https://notify.example.com/ops-alerts"
        send_resolved: true
Monitoring DashboardCPU Usage23%Memory6.2 GBRequests/s1.2KUptime99.9%Response Time (ms)

Real-time monitoring dashboard showing CPU, memory, request rate, and response time trends.

Free Resource

CI/CD Pipeline Blueprint

Our battle-tested pipeline template covering build, test, security scan, staging, and zero-downtime deployment stages.

Get the Blueprint

Resource Usage

The monitoring stack itself is lightweight:

  • cAdvisor: ~30MB RAM
  • Prometheus (30d retention): ~200MB RAM, grows ~50MB/month on disk
  • Grafana: ~50MB RAM

Total overhead: ~280MB RAM for monitoring 50+ containers.

At TechSaaS, we run Grafana + Loki + Promtail for log aggregation and add cAdvisor + Prometheus for container metrics. The entire monitoring stack uses about 127MB and gives us full visibility into every container.

Need production monitoring for your Docker infrastructure? Contact [email protected].

#docker#monitoring#prometheus#cadvisor#grafana

Related Service

Platform Engineering

From CI/CD pipelines to service meshes, we create golden paths for your developers.

Need help with devops?

TechSaaS provides expert consulting and managed services for cloud infrastructure, DevOps, and AI/ML operations.

We Will Build You a Demo Site — For Free

Like it? Pay us. Do not like it? Walk away, zero complaints. You will spend way less than hiring developers or any agency.

47+ companies trusted us
99.99% uptime
< 48hr response

No spam. No contracts. Just a free demo.