Log Management: ELK vs Loki vs Datadog — Cost, Scale, and Simplicity
Compare ELK Stack, Grafana Loki, and Datadog for log management. Storage costs, query performance, self-hosted vs SaaS, and when each makes sense.
The Log Management Problem
Modern applications generate enormous volumes of logs. A single server running 50 Docker containers can produce gigabytes of logs per day. You need a system that:
- Collects logs from all sources
- Stores them efficiently (cost matters at scale)
- Lets you search and filter quickly
- Provides alerting on error patterns
- Retains logs for compliance requirements
ELK Stack: The Established Giant
ELK (Elasticsearch, Logstash, Kibana) has been a de facto standard for log management since the early 2010s. It is powerful, because it builds a full-text index over every log line, but that same indexing makes it resource-hungry.
Architecture:
Applications → Filebeat → Logstash → Elasticsearch → Kibana
(or Filebeat → Elasticsearch directly)
# docker-compose.yml for ELK
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.15.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    volumes:
      - es-data:/usr/share/elasticsearch/data
    mem_limit: 2g
  kibana:
    image: docker.elastic.co/kibana/kibana:8.15.0
    environment:
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200
    ports:
      - "5601:5601"
    mem_limit: 512m
  filebeat:
    image: docker.elastic.co/beats/filebeat:8.15.0
    user: root  # needed to read the Docker log files and socket
    volumes:
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    mem_limit: 256m
    # Filebeat also needs a filebeat.yml with a container input; none is mounted here.

# Named volumes must be declared at the top level
volumes:
  es-data:
Elasticsearch query example:
{
  "query": {
    "bool": {
      "must": [
        { "match": { "container.name": "api-server" } },
        { "range": { "@timestamp": { "gte": "now-1h" } } }
      ],
      "filter": [
        { "term": { "level": "error" } }
      ]
    }
  },
  "sort": [{ "@timestamp": { "order": "desc" } }],
  "size": 100
}
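The query above can also be built and sent from code. The following is a minimal sketch using only the Python standard library. The function names `build_error_query` and `search` are illustrative, and the base URL and index pattern you pass to `search` are assumptions about your deployment (the Compose file above does not publish port 9200, and your index names may differ).

```python
import json
import urllib.request


def build_error_query(container, since="now-1h", size=100):
    """Build the bool query shown above for one container's recent errors."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"container.name": container}},
                    {"range": {"@timestamp": {"gte": since}}},
                ],
                "filter": [{"term": {"level": "error"}}],
            }
        },
        "sort": [{"@timestamp": {"order": "desc"}}],
        "size": size,
    }


def search(base_url, index, body):
    """POST the query to the index's _search endpoint and return the parsed JSON."""
    request = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Assuming Elasticsearch is reachable at `http://localhost:9200` and Filebeat writes to indices matching `filebeat-*`, a call would look like `search("http://localhost:9200", "filebeat-*", build_error_query("api-server"))`, with the matching documents under `["hits"]["hits"]` in the response.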
Grafana Loki: The Lightweight Alternative
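Loki was designed by Grafana Labs as a "Prometheus for logs." Unlike Elasticsearch, Loki does not index log content; it indexes only labels (metadata) and stores the log lines themselves as compressed chunks. This keeps the index small and makes Loki much cheaper to run, at the cost of slower searches over log content.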
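To make the difference concrete, here is a toy model of the two indexing strategies. It is not how either system is implemented internally; the sample log lines and function names are invented for illustration. It only shows why a label-only index stays small and why content searches then become a scan.

```python
from collections import defaultdict

# Hypothetical sample data: (labels, log line) pairs.
LOGS = [
    ({"container": "api-server"}, "GET /users 200"),
    ({"container": "api-server"}, "error: database timeout"),
    ({"container": "worker"}, "job 42 finished"),
    ({"container": "worker"}, "error: queue unreachable"),
]


def build_inverted_index(logs):
    """Full-text style: map every token in every line to the lines containing it."""
    index = defaultdict(set)
    for line_id, (_labels, line) in enumerate(logs):
        for token in line.lower().replace(":", " ").split():
            index[token].add(line_id)
    return index


def build_label_index(logs):
    """Label-only style: map each label pair to its lines; content is not indexed."""
    index = defaultdict(list)
    for line_id, (labels, _line) in enumerate(logs):
        for key, value in labels.items():
            index[(key, value)].append(line_id)
    return index


def search_inverted(index, token):
    """Look the token up directly; no scan of log content is needed."""
    return sorted(index.get(token, set()))


def search_labels_then_grep(logs, index, label, needle):
    """Select lines by label first, then scan only those lines for the substring."""
    return [i for i in index.get(label, []) if needle in logs[i][1]]


inverted = build_inverted_index(LOGS)
by_label = build_label_index(LOGS)

print(len(inverted), "index entries with full-text indexing")
print(len(by_label), "index entries with label-only indexing")
print(search_inverted(inverted, "error"))
print(search_labels_then_grep(LOGS, by_label, ("container", "api-server"), "error"))
```

The inverted index grows with the vocabulary of the logs, while the label index grows only with the number of distinct label values. The price is that the second search has to read every line in the selected stream.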
Architecture:
Applications → Promtail → Loki → Grafana
(or Alloy, Vector, Fluentd)
# docker-compose.yml for Loki stack
services:
  loki:
    image: grafana/loki:3.3.0
    command: -config.file=/etc/loki/config.yaml
    volumes:
      - ./loki/config.yaml:/etc/loki/config.yaml
      - loki-data:/loki
    mem_limit: 256m
  promtail:
    image: grafana/promtail:3.3.0
    command: -config.file=/etc/promtail/config.yaml
    volumes:
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - ./promtail/config.yaml:/etc/promtail/config.yaml
    mem_limit: 128m
  grafana:
    image: grafana/grafana:11.4.0
    environment:
      GF_AUTH_ANONYMOUS_ENABLED: "true"
    ports:
      - "3000:3000"
    mem_limit: 256m

# Named volumes must be declared at the top level
volumes:
  loki-data:
Loki configuration:
# loki/config.yaml
auth_enabled: false

server:
  http_listen_port: 3100

common:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory
  replication_factor: 1
  path_prefix: /loki

schema_config:
  configs:
    - from: 2024-01-01
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  filesystem:
    directory: /loki/chunks

limits_config:
  retention_period: 30d
  max_query_length: 30d

compactor:
  working_directory: /loki/compactor
  retention_enabled: true
  # Loki 3.x refuses to start with retention enabled unless this is set
  delete_request_store: filesystem
LogQL query examples:
# All errors from api container
{container="api-server"} |= "error"
# Parse JSON logs and filter
{job="docker"} | json | level="error" | status >= 500
# Count errors per minute
count_over_time({container="api-server"} |= "error" [1m])
# Top 10 error messages
topk(10, sum by (message) (count_over_time({container="api-server"} | json | level="error" [1h])))
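The same LogQL can be run outside Grafana through Loki's HTTP API. The sketch below builds a request for the `/loki/api/v1/query_range` endpoint using the Python standard library. The helper names are illustrative, and the base URL is an assumption: the configuration above listens on port 3100, but the Compose file does not publish that port to the host.

```python
import json
import time
import urllib.parse
import urllib.request


def build_query_range_url(base_url, logql, lookback_seconds=3600, limit=100, now_ns=None):
    """Build a query_range URL covering the last lookback_seconds."""
    end_ns = time.time_ns() if now_ns is None else now_ns
    start_ns = end_ns - lookback_seconds * 1_000_000_000
    params = {
        "query": logql,
        # start and end are sent as Unix timestamps in nanoseconds.
        "start": str(start_ns),
        "end": str(end_ns),
        "limit": str(limit),
        "direction": "backward",  # newest entries first
    }
    return f"{base_url}/loki/api/v1/query_range?{urllib.parse.urlencode(params)}"


def extract_lines(payload):
    """Flatten a streams response: each stream holds [timestamp, line, ...] entries."""
    return [entry[1] for stream in payload["data"]["result"] for entry in stream["values"]]


def fetch_lines(url):
    """Run the query and return the matching log lines."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return extract_lines(json.load(response))
```

With Loki reachable at `http://localhost:3100`, `fetch_lines(build_query_range_url("http://localhost:3100", '{container="api-server"} |= "error"'))` would return the last hour of matching lines. Metric queries such as `count_over_time(...)` return a matrix rather than streams, so `extract_lines` applies only to log queries.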
Datadog: The SaaS Powerhouse
Datadog is a cloud-hosted observability platform that combines logs, metrics, traces, and more in a single pane. No infrastructure to manage.
# Docker agent for Datadog
services:
  datadog-agent:
    image: gcr.io/datadoghq/agent:7
    environment:
      DD_API_KEY: your-api-key
      DD_LOGS_ENABLED: "true"
      DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL: "true"
      DD_CONTAINER_EXCLUDE: "name:datadog-agent"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /proc/:/host/proc/:ro
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
Comparison
| Feature | ELK Stack | Grafana Loki | Datadog |
|---|---|---|---|
| Deployment | Self-hosted | Self-hosted | SaaS |
| Full-text search | Yes (inverted index) | No (grep-like) | Yes |
| Query language | KQL / Lucene | LogQL | Custom |
| Storage efficiency | Low (indexes everything) | High (labels only) | N/A (managed) |
| RAM requirement | 2-4GB minimum | 256MB minimum | N/A |
| Disk usage (1M logs) | ~2-3GB | ~200-400MB | N/A |
| Setup complexity | High | Low | Very low |
| Dashboards | Kibana | Grafana | Built-in |
| Alerting | Watcher / ElastAlert | Grafana alerting | Built-in |
| Log parsing | Logstash / Ingest pipelines | Promtail / LogQL | Automatic |
| Correlation | Manual | With Tempo (traces) | Automatic |
| Cost (self-hosted, 10GB/day) | ~$100/month (infra) | ~$20/month (infra) | N/A |
| Cost (cloud, 10GB/day) | Elastic Cloud ~$300/month | Grafana Cloud ~$50/month | ~$250/month |
| Best for | Full-text search at scale | Cost-efficient log aggregation | Teams without ops capacity |
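The cost rows are easier to compare as a price per gigabyte ingested. The short calculation below uses the approximate monthly figures from the table and assumes a 30-day month; the figures are the table's rough estimates, not vendor quotes.

```python
DAYS_PER_MONTH = 30  # assumption: a 30-day billing month
DAILY_GB = 10        # ingest volume used in the table

# Approximate monthly costs in USD, copied from the table above.
MONTHLY_COST = {
    "ELK (self-hosted)": 100,
    "Loki (self-hosted)": 20,
    "Elastic Cloud": 300,
    "Grafana Cloud": 50,
    "Datadog": 250,
}


def cost_per_gb(monthly_cost, daily_gb=DAILY_GB, days=DAYS_PER_MONTH):
    """Monthly cost divided by the gigabytes ingested in that month."""
    return monthly_cost / (daily_gb * days)


for name, cost in MONTHLY_COST.items():
    print(f"{name}: ${cost_per_gb(cost):.2f} per GB ingested")
```

At 10GB/day the month totals 300GB, so the spread runs from about $0.07 per GB for self-hosted Loki to $1.00 per GB for Elastic Cloud.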
Resource Usage: Real Numbers
Running the same workload (50 containers, ~5GB logs/day):
| Component | ELK Stack | Loki Stack |
|---|---|---|
| Log store RAM | 2GB (Elasticsearch) | 200MB (Loki) |
| Agent RAM | 200MB (Filebeat) | 80MB (Promtail) |
| Dashboard RAM | 400MB (Kibana) | 200MB (Grafana) |
| Total RAM | 2.6GB | 480MB |
| Disk (30 days) | ~45GB | ~8GB |
Loki uses roughly 5x less RAM and 5x less disk for the same log volume. The tradeoff: no full-text search. Queries select streams by label and then scan the matching log lines, grep-style, instead of looking terms up in an inverted index.
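The ratios follow directly from the table. As a quick check, treating 2GB as 2000MB so the totals match the table's rounding:

```python
# Figures from the resource table above (RAM in MB, disk in GB).
elk = {"store": 2000, "agent": 200, "dashboard": 400, "disk_gb": 45}
loki = {"store": 200, "agent": 80, "dashboard": 200, "disk_gb": 8}


def total_ram_mb(stack):
    """Sum the RAM of the log store, the agent, and the dashboard."""
    return stack["store"] + stack["agent"] + stack["dashboard"]


ram_ratio = total_ram_mb(elk) / total_ram_mb(loki)
disk_ratio = elk["disk_gb"] / loki["disk_gb"]

print(f"RAM: {total_ram_mb(elk)} MB vs {total_ram_mb(loki)} MB ({ram_ratio:.1f}x)")
print(f"Disk: {elk['disk_gb']} GB vs {loki['disk_gb']} GB ({disk_ratio:.1f}x)")
```

Both ratios land between 5x and 6x.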
When to Choose Each
Choose ELK when: You need full-text search across log content, you have a dedicated ops team, you process 100GB+ logs/day, or you need complex log analytics.
Choose Loki when: You want minimal resource usage, you already use Grafana, label-based filtering is sufficient, or you are cost-conscious about storage.
Choose Datadog when: You do not want to manage infrastructure, you need integrated logs+metrics+traces, your team is small, or your budget allows SaaS pricing.
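The three lists above can be turned into a simple checklist. The sketch below is one hypothetical encoding: the `Needs` field names are invented, and the function only counts how many of each option's listed criteria apply. It does not weigh them against each other, so treat the output as a starting point for discussion rather than a verdict.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    # ELK criteria
    full_text_search: bool = False
    dedicated_ops_team: bool = False
    daily_log_gb: float = 0.0
    complex_analytics: bool = False
    # Loki criteria
    minimal_resources: bool = False
    uses_grafana: bool = False
    label_filtering_enough: bool = False
    storage_cost_sensitive: bool = False
    # Datadog criteria
    no_infra_to_manage: bool = False
    integrated_observability: bool = False
    small_team: bool = False
    saas_budget: bool = False


def matching_criteria(needs):
    """Count how many of each option's listed criteria apply."""
    return {
        "ELK": sum([
            needs.full_text_search,
            needs.dedicated_ops_team,
            needs.daily_log_gb >= 100,
            needs.complex_analytics,
        ]),
        "Loki": sum([
            needs.minimal_resources,
            needs.uses_grafana,
            needs.label_filtering_enough,
            needs.storage_cost_sensitive,
        ]),
        "Datadog": sum([
            needs.no_infra_to_manage,
            needs.integrated_observability,
            needs.small_team,
            needs.saas_budget,
        ]),
    }


example = Needs(uses_grafana=True, storage_cost_sensitive=True, small_team=True)
print(matching_criteria(example))
```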
At TechSaaS, we run Loki + Promtail + Grafana for our entire log stack. It uses about 480MB total RAM for 50+ containers, and the integration with Grafana gives us dashboards, alerts, and log exploration in one place. The total footprint of our observability stack (Loki + Promtail + Grafana) is 127MB of Docker images. For most self-hosted infrastructure, Loki is the clear winner.