How to organize log management in Docker?

Log management in Docker is the discipline of capturing, rotating, and centralizing what containers write. The default (json-file) works for dev but hides surprises in production: a chatty container can fill disk in hours, lookups across many containers are tedious, and one node's logs vanish if the node dies. A real production stack: stdout-only apps + log driver with rotation + a shipper that gets logs off-host.

Theory

TL;DR

  • Twelve-Factor: write to stdout/stderr; let the platform handle the rest.
  • Log drivers (json-file, journald, syslog, fluentd, loki, awslogs, gcplogs, splunk, etc.) determine what happens to those streams.
  • Default json-file writes to /var/lib/docker/containers/<id>/<id>-json.log. No rotation by default.
  • Set rotation via log-opts: max-size, max-file.
  • For multi-host setups, ship logs off-node: Loki, ELK, CloudWatch, Datadog.
  • Emit structured logs (JSON), not plaintext. Future-you will thank you.

Why stdout-only

If your app writes to a file inside the container:

  • Invisible to docker logs — Docker only sees stdout/stderr.
  • Disappears with the container unless you mount a volume — and managing a volume per container is busywork.
  • Defeats log drivers — drivers operate on stdout/stderr.
  • Breaks orchestration — Swarm/K8s assume stdout/stderr semantics.

A twelve-factor app writes to stdout. The platform (Docker, Compose, Swarm, K8s) routes it where it belongs.
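
To check which driver and options a running container actually got (the container name app is illustrative):

bash
docker inspect --format '{{json .HostConfig.LogConfig}}' app
# e.g. {"Type":"json-file","Config":{"max-size":"50m","max-file":"3"}}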

Built-in log drivers

Driver | What it does | When to use
--- | --- | ---
json-file | Default. Writes JSON to disk on host. | Dev, small ops.
local | Like json-file but binary, faster, has rotation built in. | Small ops, lower overhead.
journald | Sends to systemd journal. | Linux hosts using journald centrally.
syslog | Sends to syslog daemon (local or remote). | Legacy, simple central collection.
fluentd | Sends to a Fluentd/Fluent Bit aggregator. | Mature, vendor-neutral central pipeline.
loki | Sends to Grafana Loki. | Modern, cheap, integrates with Grafana.
awslogs | Sends to AWS CloudWatch Logs. | AWS deployments.
gcplogs | Sends to GCP Cloud Logging. | GCP deployments.
splunk | Sends to Splunk HEC. | Splunk shops.
none | Drops logs. | Tests where you do not care.

Where logs end up by default

/var/lib/docker/containers/<id>/
├── <id>-json.log        # the active log file
├── <id>-json.log.1      # rotated (if rotation set)
└── <id>-json.log.2.gz   # rotated, compressed
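
If disk is filling up, a quick check shows which containers are responsible (needs root, since the directory belongs to the daemon):

bash
sudo du -sh /var/lib/docker/containers/* | sort -rh | head -5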

Examples

Set host-wide rotation (production must-have)

json
// /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "5",
    "compress": "true"
  }
}

bash
sudo systemctl restart docker

Now every new container caps at 5 files of 100 MB = 500 MB worst case, with old files compressed.

Note: existing containers keep their old log config until they are recreated. If you set this on a server that is already running containers, those containers can still fill the disk. Recreate them or use a per-container override.
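
After the restart, confirm the daemon-level default took effect:

bash
docker info --format '{{.LoggingDriver}}'
# json-file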

Per-container override

bash
docker run -d \
  --log-driver=json-file \
  --log-opt max-size=50m \
  --log-opt max-file=3 \
  --name=app \
  myorg/app

Useful when one chatty service needs different limits.

Compose with loki driver

yaml
services:
  loki:
    image: grafana/loki:2.9.0
    ports: ["3100:3100"]
  grafana:
    image: grafana/grafana:10
    ports: ["3000:3000"]
    depends_on: ["loki"]
  api:
    image: myorg/api:1.0
    logging:
      driver: loki
      options:
        loki-url: "http://localhost:3100/loki/api/v1/push"
        loki-retries: "5"
        loki-batch-size: "400"
        loki-external-labels: "job=api,env=prod"

loki-external-labels adds tags so you can filter by service in Grafana. Note that loki-url points at localhost, not the loki service name: log-driver options are resolved by the Docker daemon on the host, which is not on the Compose network, so it must reach Loki through the published port. The driver also requires the Loki Docker driver plugin (a one-time install): docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions.
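
Verify the plugin is installed and enabled before pointing containers at it:

bash
docker plugin ls   # the "loki" alias should show ENABLED = true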

Structured logging (JSON)

App side (Node.js example):

js
console.log(JSON.stringify({
  level: 'info',
  msg: 'request handled',
  request_id: 'abc-123',
  user_id: 42,
  duration_ms: 87
}))

With json-file, Docker wraps this:

json
{"log": "{\"level\":\"info\",\"msg\":\"request handled\",...}\n", "stream": "stdout", "time": "..."}

The outer wrapper is Docker's; the inner JSON is your structured log. Loki, ELK, etc. parse the inner JSON and you can query {job="api"} |= "abc-123" or filter by user_id.
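
You can also peel the wrapper yourself on the host with jq; a sketch, assuming the app logs JSON lines (<id> is the container ID as above):

bash
sudo jq -r 'select(.stream == "stdout") | .log' \
  /var/lib/docker/containers/<id>/<id>-json.log \
  | jq .   # second pass parses the inner (application) JSON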

Fluentd / Fluent Bit pipeline

yaml
services:
  fluentbit:
    image: fluent/fluent-bit:2.2
    ports: ["24224:24224"]
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf:ro
  api:
    image: myorg/api:1.0
    logging:
      driver: fluentd
      options:
        fluentd-address: localhost:24224
        tag: api
        # If fluent-bit is unreachable, buffer in the background instead of
        # blocking the app (logs can drop if the buffer overflows):
        fluentd-async: "true"

A Fluent Bit config can route the same logs to multiple destinations: a hot search index for the last day, cold storage for compliance.
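
A minimal sketch of such a fan-out, assuming Elasticsearch as the hot index and S3 as cold storage (the hostname, index, and bucket names are illustrative):

bash
cat > fluent-bit.conf <<'EOF'
[INPUT]
    Name    forward
    Listen  0.0.0.0
    Port    24224

# Hot path: searchable index for recent logs
[OUTPUT]
    Name    es
    Match   api
    Host    elasticsearch
    Port    9200
    Index   logs-api

# Cold path: cheap long-term storage for compliance
[OUTPUT]
    Name    s3
    Match   api
    bucket  my-compliance-logs
    region  us-east-1
EOF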

Reading logs from Docker (and limits of docker logs)

bash
docker logs -f --tail=100 api                    # tail and follow
docker logs --since=10m api                      # last 10 minutes
docker logs --until=2024-01-15T10:00:00 api      # until a timestamp

Historically, docker logs worked only with the json-file, local, and journald drivers. Since Docker Engine 20.10, "dual logging" keeps a local cache (via the local driver) so docker logs works even with remote drivers, unless that cache is disabled. On older engines with fluentd, loki, syslog, etc., the logs do not stay on disk and docker logs returns nothing — use the centralized backend instead.

CloudWatch on AWS

bash
docker run -d \
  --log-driver=awslogs \
  --log-opt awslogs-region=us-east-1 \
  --log-opt awslogs-group=myapp \
  --log-opt awslogs-stream=api-1 \
  --log-opt awslogs-create-group=true \
  myorg/api:1.0

The Docker daemon needs IAM permissions to write to CloudWatch (logs:CreateLogStream, logs:PutLogEvents, and logs:CreateLogGroup when awslogs-create-group=true). On EC2 with an instance role, add a policy; on Fargate, the task role handles it.
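
A minimal policy sketch for an instance role (the role name, policy name, and resource ARN are illustrative):

bash
cat > docker-awslogs-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "logs:CreateLogGroup",
      "logs:CreateLogStream",
      "logs:PutLogEvents"
    ],
    "Resource": "arn:aws:logs:us-east-1:*:log-group:myapp*"
  }]
}
EOF
aws iam put-role-policy \
  --role-name my-instance-role \
  --policy-name docker-awslogs \
  --policy-document file://docker-awslogs-policy.json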

Kubernetes-style: node-level shipper

Docker driver-less option: leave logs on disk via json-file, run a per-host shipper (Fluent Bit, Promtail, Vector) that reads /var/lib/docker/containers/*/*-json.log and ships them. Decouples app deploy from log destination.
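
For example, Promtail as a per-host shipper (the image tag is illustrative, and a promtail-config.yml defining the Loki URL and scrape paths is assumed to exist):

bash
docker run -d \
  --name=promtail \
  -v /var/lib/docker/containers:/var/lib/docker/containers:ro \
  -v "$(pwd)/promtail-config.yml:/etc/promtail/config.yml:ro" \
  grafana/promtail:2.9.0 \
  -config.file=/etc/promtail/config.yml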

Real-world usage

  • Tiny / hobby: json-file with rotation. Read with docker logs.
  • Single-host with monitoring: Loki + Promtail + Grafana, reading /var/lib/docker/containers.
  • Multi-host self-hosted: Fluent Bit per host → Elasticsearch/OpenSearch + Kibana, or Loki + Grafana.
  • Cloud: native — awslogs, gcplogs, Datadog, Splunk.
  • Compliance environments: long-term cold storage (S3) + hot index for the last 30 days.

Common mistakes

Writing to log files inside the container

A Java app with log4j writing to /var/log/app.log defeats docker logs and disappears with the container. Configure log4j to write to console (stdout) instead.

No rotation on json-file

Out of the box, json-file does no rotation. A chatty app on the default config will fill the disk. Always set max-size and max-file.

Logging sensitive data

Passwords, JWTs, and PII end up in logs because someone called console.log(req.body). Sanitize at the source. Centralized log indexing makes leaks easy to find — and easy to scrape.

Using docker logs in production for everything

Works on a single host but does not scale. Centralize early; trying to retroactively add log shipping mid-incident is painful.

Not setting compress: true

Without it, rotated .log.N files sit on disk at full size. compress: "true" gzips them, often 10x smaller.

Follow-up questions

Q: Why does docker logs return nothing for some containers?


A: That container uses a non-disk driver (fluentd, loki, awslogs) on an engine without the dual-logging cache (pre-20.10, or with the cache disabled). Logs are shipped, not stored. Query the destination instead.

Q: What is the difference between json-file and local?


A: Both store on disk on the host. local uses a binary format (smaller, faster), supports built-in rotation, and is recommended for new setups. json-file is older and the historical default. Functionally similar for docker logs.

Q: Should I log JSON or plaintext?


A: JSON. Modern log backends parse it natively, and you can filter by fields (level=error, user_id=42). Plaintext is fine for tiny services but does not scale.

Q: (Senior) What are the trade-offs between fluentd-async: true and synchronous?


A: Synchronous: if Fluent Bit is down, the Docker daemon blocks on write(), which can hang containers. Async: Docker buffers (small, in-memory) and continues; on overflow, logs drop. For app stability, async with a generous buffer is safer; for compliance where log loss is unacceptable, run a high-availability log aggregator (failover, persistent queues) and use sync.
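
Docker also exposes this trade-off generically for any driver via the delivery mode; a sketch (the buffer size is illustrative):

bash
docker run -d \
  --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224 \
  --log-opt mode=non-blocking \
  --log-opt max-buffer-size=4m \
  myorg/api:1.0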

Q: (Senior) How do you handle multi-line stack traces in logs?


A: Each line written to stdout becomes a separate log event, so a 30-line Java stack trace becomes 30 unrelated log entries. Two fixes: (1) have the app emit a single JSON event with the full trace as a string field; (2) use a log shipper (Fluent Bit, Filebeat, Vector) with multiline parsing rules that join lines starting with whitespace into the previous event. The source-side fix is more reliable; a sketch of option (2) follows.
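
A sketch of option (2) using Fluent Bit's built-in multiline parsers (requires Fluent Bit 1.8+; the java parser is an assumption for Java apps):

bash
cat > fluent-bit-multiline.conf <<'EOF'
[INPUT]
    Name              tail
    Path              /var/lib/docker/containers/*/*-json.log
    # "docker" first unwraps the json-file format, then "java" joins
    # continuation lines of a stack trace into one event
    multiline.parser  docker, java
EOF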
