
How to share data between Docker containers?

Sharing data between containers has several good answers depending on what you mean by "data". Files? Shared volume. Messages? Network calls. Broadcast? A queue. Picking the right one matters because the wrong choice creates concurrency bugs at scale.

Theory

TL;DR

  • Files (read-mostly or single-writer): named volume mounted into multiple containers.
  • Service-to-service (RPC, HTTP): Docker network. Containers call each other by name.
  • Many writers, queue semantics: message queue (Redis, RabbitMQ, Kafka).
  • Read-only config: bind-mount the same file into each container, or use Docker configs / Swarm secrets.
  • Avoid --volumes-from for new code — works but is the legacy form. Use named volumes.

Pattern 1: shared named volume

```bash
docker volume create shared

docker run -d --name writer \
  -v shared:/data \
  alpine sh -c 'while true; do date >> /data/log.txt; sleep 5; done'

docker run --rm -v shared:/data alpine cat /data/log.txt
# Wed Apr 30 12:00:00 UTC 2026
# Wed Apr 30 12:00:05 UTC 2026
# ...
```

Both containers see the same /data/log.txt. Concurrent writes need application-level coordination (the volume itself does not lock).

Pattern 2: network communication

Often the right answer when "data" is actually a service or RPC, not files.

```yaml
services:
  api:
    image: myapp
    environment:
      WORKER_URL: http://worker:5000
  worker:
    image: myworker
    expose: ["5000"]
```

api calls http://worker:5000/.... The data shared is the API contract, not files.

When to prefer this over a shared volume:

  • Multiple writers from different processes (race conditions in shared files vs sequential API calls).
  • Cross-host operation (volumes are local; networks span hosts via overlay).
  • Versioning the contract (HTTP/gRPC has explicit schemas; raw files do not).

Pattern 3: message queue

For broadcast, fan-out, async work distribution:

```yaml
services:
  redis:
    image: redis:7
  producer:
    image: myproducer
    environment: { REDIS_URL: "redis://redis:6379" }
    depends_on: [redis]
  worker:
    image: myworker
    deploy: { replicas: 5 }
    environment: { REDIS_URL: "redis://redis:6379" }
    depends_on: [redis]
```

Producer pushes messages to Redis. Five workers pull and process. The data flows through the queue, not through a shared filesystem.

Same pattern with RabbitMQ, Kafka, NATS. Pick by guarantees needed (at-least-once, at-most-once, ordering, persistence).

Pattern 4: read-only config sharing

Multiple containers need the same config file:

```yaml
services:
  web:
    image: nginx
    volumes:
      - ./shared.conf:/etc/myapp/config.conf:ro
  api:
    image: myapp
    volumes:
      - ./shared.conf:/etc/myapp/config.conf:ro
```

Bind-mount the same host file into both. :ro makes it read-only. Edit on host, restart containers, both pick up the new config.

For Swarm or K8s, use configs (docker config create) and secrets (docker secret create) — they distribute files across the cluster.

--volumes-from (legacy)

```bash
docker run -d --name data -v /shared alpine
docker run --rm --volumes-from data alpine ls /shared
```

The second container inherits the first's volume mounts. Originally used for "data containers" — a pattern from before named volumes existed. Today, named volumes are cleaner. Avoid --volumes-from in new code.

Concurrent access pitfalls

A shared volume is just a directory. The OS does not magically serialize writes across containers. Common gotchas:

```bash
# Container A writes file.json
# Container B reads file.json AT THE SAME TIME
# B might read a partial write → JSON parse error
```

Mitigations:

  • Atomic writes: write to tmp.json then mv tmp.json file.json (atomic on same filesystem).
  • File locks: flock works, but coordinating across containers requires the lock file to live on the shared volume.
  • Single-writer convention: only one container ever writes; others read.
  • Use a queue or DB instead when concurrent access is the norm.
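The atomic-write mitigation from the list above can be sketched in plain shell (the `./data` directory stands in for the shared volume mount; filenames are illustrative):

```shell
# Write to a temp file on the SAME filesystem, then rename.
# rename(2) is atomic, so a concurrent reader sees either the old
# file or the new one — never a half-written JSON document.
DATA_DIR=./data
mkdir -p "$DATA_DIR"

printf '{"status": "ok"}\n' > "$DATA_DIR/file.json.tmp"
mv "$DATA_DIR/file.json.tmp" "$DATA_DIR/file.json"

cat "$DATA_DIR/file.json"
```

The same-filesystem constraint matters: `mv` across filesystems degrades to copy-then-delete, which is not atomic.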

Common mistakes

Sharing a volume for what should be RPC

```bash
# WRONG: api writes a request file, worker watches the directory
docker run -d --name api -v shared:/queue myapp
docker run -d --name worker -v shared:/queue myworker
# Race conditions, file locking nightmares.

# RIGHT: api calls worker via HTTP
# OR producer pushes to Redis, workers consume
```

File-as-message is the pattern that drove people to invent message queues in the first place.

Bind-mounting host paths and assuming portability

```yaml
volumes:
  - /home/user/myapp/data:/data
  # Works on host A. Fails on host B if /home/user/myapp/data does not exist.
```

Named volumes are portable; bind mounts tie you to a specific host filesystem. For shared data on multi-host, use a network volume driver or a network service.

Forgetting permissions across containers

Container A runs as UID 1000, writes files. Container B runs as root, no problem. Container C runs as UID 1001 — cannot read what A wrote (perm 700, owner 1000).

Pick a UID strategy: same UID across containers, or files written with permissive modes (umask 002).
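The same-UID strategy can be applied directly in Compose (UID/GID 1000 and the image names are illustrative):

```yaml
services:
  writer:
    image: myapp
    user: "1000:1000"        # same UID:GID in every container touching the volume
    volumes: [shared:/data]
  reader:
    image: myreader
    user: "1000:1000"
    volumes: [shared:/data:ro]
volumes:
  shared:
```

Note that `user:` only changes the process identity; files already created under a different UID keep their old owner until you `chown` them.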

Mounting the same volume read-write everywhere

Mounting read-write everywhere is rarely needed — often only one container writes. Mount writers RW, readers RO:

```yaml
writer:
  volumes: [shared:/data]
reader-1:
  volumes: [shared:/data:ro]   # ← read-only
reader-2:
  volumes: [shared:/data:ro]
```

Catches accidental writes from readers.

Real-world usage

  • Web + reverse proxy sharing certs: both nginx and certbot mount /etc/letsencrypt so certbot writes, nginx reads.
  • App + log shipper sidecar: both mount a /var/log/app volume; app writes, fluentd reads and ships.
  • Backup container: read-only mount of the production volume, runs tar to backup destination.
  • Build artifacts: CI builds in container A, container B (publish) reads the result via a shared volume mounted to both.
  • Init container pattern (K8s, also doable in Compose with service_completed_successfully): init writes config, app reads it.
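The backup bullet above can be sketched as a one-shot Compose service (the volume name `prod_data` and the backup path are illustrative; the volume is assumed to already exist):

```yaml
services:
  backup:
    image: alpine
    volumes:
      - prod_data:/data:ro          # read-only: backup must never mutate prod
      - ./backups:/backup
    command: tar czf /backup/data-backup.tgz -C /data .
volumes:
  prod_data:
    external: true                  # created elsewhere, not by this file
```

Run it on demand with `docker compose run --rm backup`.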

Follow-up questions

Q: Can two containers write to the same file simultaneously?


A: Yes, but the OS does not coordinate. Last-write-wins or partial writes are likely. Application-level coordination (locks, atomic writes, append-only logs) is your responsibility.
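A minimal sketch of the file-lock option, runnable in plain shell (paths are illustrative; `flock` comes from util-linux and must be present in every image that participates):

```shell
# Two writers serialize appends through an advisory lock file.
# Inside containers, LOCK and LOG would live on the shared volume.
LOCK=./data/app.lock
LOG=./data/log.txt
mkdir -p ./data
: > "$LOG"

append_line() {
  # flock(1) blocks until the lock on $LOCK is free, then runs the append.
  flock "$LOCK" sh -c "echo '$1' >> '$LOG'"
}

append_line "writer-a"
append_line "writer-b"
cat "$LOG"
```

Advisory locks only work if every writer cooperates — a process that skips `flock` can still corrupt the file.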

Q: Are shared volumes safe across host reboots?


A: Yes — volumes are persistent. Reboot the host, restart the containers with the same volume mounts, data is intact.

Q: What is the difference between a shared volume and a bind mount of the same host path?


A: Functionally similar — both let multiple containers see the same files. Volumes are Docker-managed (portable, lifecycle controlled), bind mounts pin to a host path (not portable).

Q: How do I share data between containers on different hosts?


A: A few options. (1) Network service (DB, Redis) accessible from both hosts. (2) Network filesystem (NFS, GlusterFS, EFS) mounted on both hosts and bind-mounted into containers. (3) Object storage (S3, Minio) with both containers as clients.
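Option (2) can also be expressed as a Docker-managed volume backed by NFS, so containers keep using a named volume (the server address 192.168.1.10 and export path are hypothetical):

```yaml
volumes:
  nfs-shared:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.168.1.10,rw"   # hypothetical NFS server
      device: ":/exports/shared"  # hypothetical export path
```

Every host that mounts `nfs-shared` sees the same files, but NFS inherits all the concurrent-write caveats discussed above — and adds network latency.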

Q: (Senior) When does sharing a volume become an architectural smell?


A: When two services share state without a clear owner — both write to the same files. That coupling makes scaling them independently impossible: one's bug becomes the other's problem. The smell is followed by the fix: introduce an owning service (DB, queue) that becomes the single source of truth, with both "sharers" as clients. Filesystem-as-IPC is fine for caches and immutable data; for live mutable state, prefer a service.

Examples

Sidecar log shipper pattern

```yaml
services:
  app:
    image: myapp
    volumes:
      - applogs:/var/log/app
  fluentd:
    image: fluent/fluentd
    volumes:
      - applogs:/fluentd/log:ro   # read-only — fluentd does not write app logs
      - ./fluent.conf:/fluentd/etc/fluent.conf:ro
    depends_on: [app]
volumes:
  applogs:
```

App writes logs to a volume, fluentd reads them and ships to an aggregator. Loose coupling, single owner of writes.

Build-and-publish pipeline

```yaml
services:
  build:
    image: builder
    volumes:
      - artifact:/out
    command: build-script.sh
    restart: "no"
  publish:
    image: publisher
    volumes:
      - artifact:/in:ro   # publisher does not modify
    depends_on:
      build:
        condition: service_completed_successfully
    restart: "no"
volumes:
  artifact:
```

Build stage produces files, publish stage consumes them. The volume is the handoff. service_completed_successfully makes the order explicit.

Cert sharing between web and certbot

```yaml
services:
  web:
    image: nginx:1.27-alpine
    volumes:
      - certs:/etc/letsencrypt:ro
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    ports: ["80:80", "443:443"]
  certbot:
    image: certbot/certbot
    volumes:
      - certs:/etc/letsencrypt          # certbot writes
      - ./acme-challenge:/var/www/acme-challenge
    entrypoint: |
      sh -c "trap exit TERM; while :; do
        certbot renew --webroot -w /var/www/acme-challenge --quiet
        sleep 12h & wait $${!}
      done"
volumes:
  certs:
```

Certbot writes certs to the volume; nginx reads them. Renewal happens periodically; signal nginx to reload (e.g. `docker compose exec web nginx -s reload`) so it picks up new certs without a container restart — the reload must run in the web container, not the certbot one.

