Log Management: ELK vs Loki vs Datadog — Cost, Scale, and Simplicity

Compare ELK Stack, Grafana Loki, and Datadog for log management. Storage costs, query performance, self-hosted vs SaaS, and when each makes sense.

Yash Pritwani

The Log Management Problem

Modern applications generate enormous volumes of logs. A single server running 50 Docker containers can produce gigabytes of logs per day. You need a system that:

Collects logs from all sources
Stores them efficiently (cost matters at scale)
Lets you search and filter quickly
Provides alerting on error patterns
Retains logs for compliance requirements
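
As a rough sanity check on the "gigabytes per day" figure, the arithmetic can be sketched as below. The per-container line rate and the average line size are illustrative assumptions, not measurements from this article.

```python
def daily_log_volume_gb(containers: int, lines_per_sec: float, avg_line_bytes: int) -> float:
    """Estimate raw log volume per day in decimal gigabytes."""
    seconds_per_day = 24 * 60 * 60
    total_bytes = containers * lines_per_sec * avg_line_bytes * seconds_per_day
    return total_bytes / 1e9


# Hypothetical workload: 50 containers, 10 lines/s each, 200 bytes per line.
volume = daily_log_volume_gb(containers=50, lines_per_sec=10, avg_line_bytes=200)
print(f"{volume:.2f} GB/day")
```

Even a modest line rate per container adds up to several gigabytes per day, which is why storage efficiency shows up in every comparison that follows.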


ELK Stack: The Established Giant

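ELK (Elasticsearch, Logstash, Kibana) has been a de facto standard for self-hosted log management since the early 2010s. It is powerful, offering full-text search and rich aggregations, but resource-hungry: Elasticsearch runs on the JVM and builds an inverted index over log content.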

Architecture:

Applications → Filebeat → Logstash → Elasticsearch → Kibana
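                                  (or Filebeat → Elasticsearch directly)

# docker-compose.yml for ELK
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.15.0
    environment:
      - discovery.type=single-node
      # Security disabled for local testing only; enable it before exposing the cluster
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    volumes:
      - es-data:/usr/share/elasticsearch/data
    mem_limit: 2g

  kibana:
    image: docker.elastic.co/kibana/kibana:8.15.0
    environment:
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200
    ports:
      - "5601:5601"
    mem_limit: 512m

  filebeat:
    image: docker.elastic.co/beats/filebeat:8.15.0
    user: root  # needed to read container log files and the Docker socket
    volumes:
      # Assumed local config that defines the container input and the Elasticsearch output
      - ./filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    mem_limit: 256m

# Named volume referenced by the elasticsearch service
volumes:
  es-data: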

Elasticsearch query example:

{
  "query": {
    "bool": {
      "must": [
        { "match": { "container.name": "api-server" } },
        { "range": { "@timestamp": { "gte": "now-1h" } } }
      ],
      "filter": [
        { "term": { "level": "error" } }
      ]
    }
  },
  "sort": [{ "@timestamp": { "order": "desc" } }],
  "size": 100
}
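
If the same search is needed for several containers or time windows, the query body can be generated rather than hand-edited. The sketch below only builds the JSON shown above; the helper name `build_error_query` is hypothetical, and the field names are taken from the example as-is.

```python
import json


def build_error_query(container: str, window: str = "now-1h", size: int = 100) -> dict:
    """Build the bool query from the example for a given container and time window."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"container.name": container}},
                    {"range": {"@timestamp": {"gte": window}}},
                ],
                "filter": [{"term": {"level": "error"}}],
            }
        },
        "sort": [{"@timestamp": {"order": "desc"}}],
        "size": size,
    }


body = build_error_query("api-server")
print(json.dumps(body, indent=2))
```

The resulting body would be sent to the `_search` endpoint of whichever index or index pattern your shipper writes to; that index name depends on your Filebeat or Logstash configuration and is not specified here.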

Grafana Loki: The Lightweight Alternative

Loki is designed by Grafana Labs as a "Prometheus for logs." Unlike Elasticsearch, Loki does not index log content — it only indexes labels (metadata). This makes it dramatically cheaper to run.
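
To make the label-only idea concrete, here is a toy in-memory model. It is not how Loki is implemented (there are no chunks, compression, or time ranges); the class `ToyLabelStore` is purely illustrative. It shows the two-step shape of a query: a cheap lookup by labels, then a brute-force scan of the selected lines.

```python
from collections import defaultdict


class ToyLabelStore:
    """Streams are keyed by their label set; log lines themselves are not indexed."""

    def __init__(self):
        self.streams = defaultdict(list)

    def push(self, labels: dict, line: str) -> None:
        # The only "index" is the label set that identifies the stream.
        self.streams[frozenset(labels.items())].append(line)

    def query(self, selector: dict, contains: str = "") -> list:
        wanted = set(selector.items())
        hits = []
        for labels, lines in self.streams.items():
            if wanted <= labels:  # step 1: select streams by label
                # step 2: grep-style scan over the selected lines
                hits.extend(line for line in lines if contains in line)
        return hits


store = ToyLabelStore()
store.push({"container": "api-server"}, "level=error msg=timeout")
store.push({"container": "api-server"}, "level=info msg=ok")
store.push({"container": "web"}, "level=error msg=boom")
print(store.query({"container": "api-server"}, contains="error"))
```

Because nothing is indexed per word, writes stay cheap and storage stays small, while the cost moves to query time when a selector matches a lot of data.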

Architecture:

Applications → Promtail → Loki → Grafana
                     (or Alloy, Vector, Fluentd)

# docker-compose.yml for Loki stack
services:
  loki:
    image: grafana/loki:3.3.0
    command: -config.file=/etc/loki/config.yaml
    volumes:
      - ./loki/config.yaml:/etc/loki/config.yaml
      - loki-data:/loki
    mem_limit: 256m

  promtail:
    image: grafana/promtail:3.3.0
    command: -config.file=/etc/promtail/config.yaml
    volumes:
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - ./promtail/config.yaml:/etc/promtail/config.yaml
    mem_limit: 128m

  grafana:
    image: grafana/grafana:11.4.0
    environment:
      # Anonymous access is convenient for local testing; disable it on shared hosts
      GF_AUTH_ANONYMOUS_ENABLED: "true"
    ports:
      - "3000:3000"
    mem_limit: 256m

# Named volume referenced by the loki service
volumes:
  loki-data:

Loki configuration:

# loki/config.yaml
auth_enabled: false

server:
  http_listen_port: 3100

common:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory
  replication_factor: 1
  path_prefix: /loki

schema_config:
  configs:
    - from: 2024-01-01
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  filesystem:
    directory: /loki/chunks

limits_config:
  retention_period: 30d
  max_query_length: 30d

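compactor:
  working_directory: /loki/compactor
  retention_enabled: true
  # Loki 3.x requires a delete request store when retention is enabled
  delete_request_store: filesystem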

LogQL query examples:

# Lines from the api-server container that contain "error" (case-sensitive substring match)
{container="api-server"} |= "error"

# Parse JSON logs and filter
{job="docker"} | json | level="error" | status >= 500

# Count errors per minute
count_over_time({container="api-server"} |= "error" [1m])

# Top 10 error messages
topk(10, sum by (message) (count_over_time({container="api-server"} | json | level="error" [1h])))
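
The same LogQL can be run outside Grafana through Loki's HTTP API. The sketch below only builds a request URL for the `/loki/api/v1/query_range` endpoint; it assumes Loki is reachable at `http://localhost:3100` (the port from the config above), and the helper name `loki_query_range_url` is hypothetical.

```python
import time
from typing import Optional
from urllib.parse import urlencode


def loki_query_range_url(base_url: str, logql: str, lookback_s: int = 3600,
                         limit: int = 100, now_s: Optional[int] = None) -> str:
    """Build a query_range URL covering the last `lookback_s` seconds."""
    now = int(time.time()) if now_s is None else now_s
    params = {
        "query": logql,
        # Timestamps are passed as Unix epoch nanoseconds.
        "start": (now - lookback_s) * 10**9,
        "end": now * 10**9,
        "limit": limit,
        "direction": "backward",  # newest entries first
    }
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"


url = loki_query_range_url(
    "http://localhost:3100",
    '{container="api-server"} |= "error"',
    now_s=1_700_000_000,  # fixed timestamp so the output is reproducible
)
print(url)
```

Fetching that URL with any HTTP client returns the matching streams as JSON, which is handy for scripts and ad hoc checks.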


Datadog: The SaaS Powerhouse

Datadog is a cloud-hosted observability platform that combines logs, metrics, traces, and more in a single pane. No infrastructure to manage.

# Docker agent for Datadog
services:
  datadog-agent:
    image: gcr.io/datadoghq/agent:7
    environment:
      DD_API_KEY: ${DD_API_KEY}  # read from the environment or an .env file; never commit a real key
      DD_LOGS_ENABLED: "true"
      DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL: "true"
      DD_CONTAINER_EXCLUDE: "name:datadog-agent"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /proc/:/host/proc/:ro
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
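
With container log collection enabled, the agent picks up whatever each container writes to stdout and stderr. Emitting one JSON object per line generally makes that output easier to parse, both for Datadog and for Loki's `| json` stage. The formatter below is a minimal sketch using only the standard library; the field names (`timestamp`, `level`, `logger`, `message`) are assumptions, not a required schema.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname.lower(),
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


handler = logging.StreamHandler(sys.stdout)  # containers log to stdout
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("api-server")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.error("upstream timeout")
```

Keeping `level` as a top-level field means the same logs work with filters like `level="error"` in the query examples earlier.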

Comparison

| Feature | ELK Stack | Grafana Loki | Datadog |
|---------|-----------|--------------|---------|
| Deployment | Self-hosted | Self-hosted | SaaS |
| Full-text search | Yes (inverted index) | No (grep-like) | Yes |
| Query language | KQL / Lucene | LogQL | Custom |
| Storage efficiency | Low (indexes everything) | High (labels only) | N/A (managed) |
| RAM requirement | 2-4GB minimum | 256MB minimum | N/A |
| Disk usage (1M logs) | ~2-3GB | ~200-400MB | N/A |
| Setup complexity | High | Low | Very low |
| Dashboards | Kibana | Grafana | Built-in |
| Alerting | Watcher / ElastAlert | Grafana alerting | Built-in |
| Log parsing | Logstash / Ingest pipelines | Promtail / LogQL | Automatic |
| Correlation | Manual | With Tempo (traces) | Automatic |
| Cost (self-hosted, 10GB/day) | ~$100/month (infra) | ~$20/month (infra) | N/A |
| Cost (cloud, 10GB/day) | Elastic Cloud ~$300/month | Grafana Cloud ~$50/month | ~$250/month |
| Best for | Full-text search at scale | Cost-efficient log aggregation | Teams without ops capacity |
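
To compare the cost rows on a common basis, the monthly figures can be converted to a price per ingested gigabyte. The sketch below uses the approximate numbers from the table and assumes a 30-day month; it is simple arithmetic, not a pricing model.

```python
DAILY_GB = 10
DAYS_PER_MONTH = 30  # assumption for the conversion

# Approximate monthly costs (USD) at 10GB/day, taken from the comparison table.
MONTHLY_COST_USD = {
    "ELK (self-hosted)": 100,
    "Loki (self-hosted)": 20,
    "Elastic Cloud": 300,
    "Grafana Cloud": 50,
    "Datadog": 250,
}


def cost_per_gb(monthly_usd: float) -> float:
    """Convert a monthly cost into USD per ingested GB."""
    return monthly_usd / (DAILY_GB * DAYS_PER_MONTH)


for name, usd in MONTHLY_COST_USD.items():
    print(f"{name:<20} ${cost_per_gb(usd):.2f}/GB")
```

Plugging in your own ingest volume and quotes gives a quick first comparison, though real SaaS pricing usually also depends on retention and indexing options.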

Resource Usage: Real Numbers

Running the same workload (50 containers, ~5GB logs/day):

| Component | ELK Stack | Loki Stack |
|-----------|-----------|------------|
| Log store RAM | 2GB (Elasticsearch) | 200MB (Loki) |
| Agent RAM | 200MB (Filebeat) | 80MB (Promtail) |
| Dashboard RAM | 400MB (Kibana) | 200MB (Grafana) |
| Total RAM | 2.6GB | 480MB |
| Disk (30 days) | ~45GB | ~8GB |

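By these numbers, Loki uses roughly 5x less RAM (480MB vs 2.6GB) and 5x less disk (~8GB vs ~45GB) for the same log volume. The tradeoff: no full-text index. Loki selects streams by label and time range, then scans the matching chunks grep-style, so broad searches over long time ranges are slower than an inverted-index lookup in Elasticsearch.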
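
The ratios follow directly from the table. A quick check, using the table's figures with 1GB taken as 1000MB:

```python
# RAM per component in MB, from the resource table.
ram_mb = {
    "ELK": {"store": 2000, "agent": 200, "dashboard": 400},
    "Loki": {"store": 200, "agent": 80, "dashboard": 200},
}
# Disk usage for 30 days of retention, in GB.
disk_gb = {"ELK": 45, "Loki": 8}

totals = {stack: sum(parts.values()) for stack, parts in ram_mb.items()}
ram_ratio = totals["ELK"] / totals["Loki"]
disk_ratio = disk_gb["ELK"] / disk_gb["Loki"]

print(f"Total RAM: ELK {totals['ELK']}MB, Loki {totals['Loki']}MB")
print(f"RAM ratio: {ram_ratio:.1f}x, disk ratio: {disk_ratio:.1f}x")
```

Both ratios come out slightly above 5x.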


When to Choose Each

Choose ELK when: You need full-text search across log content, you have a dedicated ops team, you process 100GB+ logs/day, or you need complex log analytics.

Choose Loki when: You want minimal resource usage, you already use Grafana, label-based filtering is sufficient, or you are cost-conscious about storage.

Choose Datadog when: You do not want to manage infrastructure, you need integrated logs+metrics+traces, your team is small, or your budget allows SaaS pricing.
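
The three rules of thumb above can be written down as a small decision helper. This is only a sketch of the guidance in this section: the `Requirements` fields and the order in which the rules are checked are assumptions, and real decisions involve more factors.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    wants_managed: bool           # prefer not to run log infrastructure
    needs_full_text_search: bool  # search across log content, not just labels
    has_ops_team: bool            # dedicated people to operate the stack
    daily_gb: float               # log volume per day


def suggest_stack(req: Requirements) -> str:
    """Map requirements to a starting suggestion: Datadog, ELK, or Loki."""
    if req.wants_managed:
        return "Datadog"
    if req.has_ops_team and (req.needs_full_text_search or req.daily_gb >= 100):
        return "ELK"
    return "Loki"


print(suggest_stack(Requirements(False, False, False, 5)))
print(suggest_stack(Requirements(False, True, True, 150)))
print(suggest_stack(Requirements(True, False, False, 20)))
```

Treat the result as a first guess to validate against budget and team capacity rather than a verdict.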

At TechSaaS, we run Loki + Promtail + Grafana for our entire log stack. It uses about 480MB total RAM for 50+ containers, and the integration with Grafana gives us dashboards, alerts, and log exploration in one place. The total footprint of our observability stack (Loki + Promtail + Grafana) is 127MB of Docker images. For most self-hosted infrastructure, Loki is the clear winner.
