
How to share data between Docker containers?

Sharing data between containers has several good answers depending on what you mean by "data". Files? Shared volume. Messages? Network calls. Broadcast? A queue. Picking the right one matters because the wrong choice creates concurrency bugs at scale.

Theory

TL;DR

  • Files (read-mostly or single-writer): named volume mounted into multiple containers.
  • Service-to-service (RPC, HTTP): Docker network. Containers call each other by name.
  • Many writers, queue semantics: message queue (Redis, RabbitMQ, Kafka).
  • Read-only config: bind-mount the same file into each container, or use Docker configs / Swarm secrets.
  • Avoid --volumes-from for new code — works but is the legacy form. Use named volumes.

Pattern 1: shared named volume

```bash
docker volume create shared

docker run -d --name writer \
  -v shared:/data \
  alpine sh -c 'while true; do date >> /data/log.txt; sleep 5; done'

docker run --rm -v shared:/data alpine cat /data/log.txt
# Wed Apr 30 12:00:00 UTC 2026
# Wed Apr 30 12:00:05 UTC 2026
# ...
```

Both containers see the same /data/log.txt. Concurrent writes need application-level coordination (the volume itself does not lock).

Pattern 2: network communication

Often the right answer when "data" is actually a service or RPC, not files.

```yaml
services:
  api:
    image: myapp
    environment:
      WORKER_URL: http://worker:5000
  worker:
    image: myworker
    expose: ["5000"]
```

api calls http://worker:5000/.... The data shared is the API contract, not files.

When to prefer this over a shared volume:

  • Multiple writers from different processes (race conditions in shared files vs sequential API calls).
  • Cross-host operation (volumes are local; networks span hosts via overlay).
  • Versioning the contract (HTTP/gRPC has explicit schemas; raw files do not).

Pattern 3: message queue

For broadcast, fan-out, async work distribution:

```yaml
services:
  redis:
    image: redis:7
  producer:
    image: myproducer
    environment: { REDIS_URL: "redis://redis:6379" }
    depends_on: [redis]
  worker:
    image: myworker
    deploy: { replicas: 5 }
    environment: { REDIS_URL: "redis://redis:6379" }
    depends_on: [redis]
```

Producer pushes messages to Redis. Five workers pull and process. The data flows through the queue, not through a shared filesystem.

Same pattern with RabbitMQ, Kafka, NATS. Pick by guarantees needed (at-least-once, at-most-once, ordering, persistence).

Pattern 4: read-only config sharing

Multiple containers need the same config file:

```yaml
services:
  web:
    image: nginx
    volumes:
      - ./shared.conf:/etc/myapp/config.conf:ro
  api:
    image: myapp
    volumes:
      - ./shared.conf:/etc/myapp/config.conf:ro
```

Bind-mount the same host file into both. :ro makes it read-only. Edit on host, restart containers, both pick up the new config.

For Swarm or K8s, use configs (docker config create) and secrets (docker secret create) — they distribute files across the cluster.

--volumes-from (legacy)

```bash
docker run -d --name data -v /shared alpine
docker run --rm --volumes-from data alpine ls /shared
```

The second container inherits the first's volume mounts. Originally used for "data containers" — a pattern from before named volumes existed. Today, named volumes are cleaner. Avoid --volumes-from in new code.

Concurrent access pitfalls

A shared volume is just a directory. The OS does not magically serialize writes across containers. Common gotchas:

```bash
# Container A writes file.json
# Container B reads file.json AT THE SAME TIME
# B might read a partial write → JSON parse error
```

Mitigations:

  • Atomic writes: write to tmp.json then mv tmp.json file.json (atomic on same filesystem).
  • File locks: flock works, but coordinating across containers requires the lock file to live on the shared volume.
  • Single-writer convention: only one container ever writes; others read.
  • Use a queue or DB instead when concurrent access is the norm.
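The atomic-write mitigation from the list above can be sketched in plain shell (the `./data` directory stands in for the shared volume mount; filenames are illustrative):

```shell
# Write to a temp file on the SAME filesystem, then rename.
# rename(2) is atomic, so a concurrent reader sees either the old
# file or the new one — never a half-written JSON document.
DATA_DIR=./data
mkdir -p "$DATA_DIR"

printf '{"status": "ok"}\n' > "$DATA_DIR/file.json.tmp"
mv "$DATA_DIR/file.json.tmp" "$DATA_DIR/file.json"

cat "$DATA_DIR/file.json"
```

The same-filesystem constraint matters: `mv` across filesystems degrades to copy-then-delete, which is not atomic.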

Common mistakes

Sharing a volume for what should be RPC

```bash
# WRONG: api writes a request file, worker watches the directory
docker run -d --name api -v shared:/queue myapp
docker run -d --name worker -v shared:/queue myworker
# Race conditions, file locking nightmares.

# RIGHT: api calls worker via HTTP
# OR producer pushes to Redis, workers consume
```

File-as-message is the pattern that drove people to invent message queues in the first place.

Bind-mounting host paths and assuming portability

```yaml
volumes:
  - /home/user/myapp/data:/data
  # Works on host A. Fails on host B if /home/user/myapp/data does not exist.
```

Named volumes are portable; bind mounts tie you to a specific host filesystem. For shared data on multi-host, use a network volume driver or a network service.

Forgetting permissions across containers

Container A runs as UID 1000, writes files. Container B runs as root, no problem. Container C runs as UID 1001 — cannot read what A wrote (perm 700, owner 1000).

Pick a UID strategy: same UID across containers, or files written with permissive modes (umask 002).
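The same-UID strategy can be applied directly in Compose (UID/GID 1000 and the image names are illustrative):

```yaml
services:
  writer:
    image: myapp
    user: "1000:1000"        # same UID:GID in every container touching the volume
    volumes: [shared:/data]
  reader:
    image: myreader
    user: "1000:1000"
    volumes: [shared:/data:ro]
volumes:
  shared:
```

Note that `user:` only changes the process identity; files already created under a different UID keep their old owner until you `chown` them.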

Mounting the same volume read-write everywhere

Mounting read-write everywhere is rarely needed — often only one container writes. Mount writers RW, readers RO:

```yaml
writer:
  volumes: [shared:/data]
reader-1:
  volumes: [shared:/data:ro]   # ← read-only
reader-2:
  volumes: [shared:/data:ro]
```

Catches accidental writes from readers.

Real-world usage

  • Web + reverse proxy sharing certs: both nginx and certbot mount /etc/letsencrypt so certbot writes, nginx reads.
  • App + log shipper sidecar: both mount a /var/log/app volume; app writes, fluentd reads and ships.
  • Backup container: read-only mount of the production volume, runs tar to backup destination.
  • Build artifacts: CI builds in container A, container B (publish) reads the result via a shared volume mounted to both.
  • Init container pattern (K8s, also doable in Compose with service_completed_successfully): init writes config, app reads it.
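The backup bullet above can be sketched as a one-shot Compose service (the volume name `prod_data` and the backup path are illustrative; the volume is assumed to already exist):

```yaml
services:
  backup:
    image: alpine
    volumes:
      - prod_data:/data:ro          # read-only: backup must never mutate prod
      - ./backups:/backup
    command: tar czf /backup/data-backup.tgz -C /data .
volumes:
  prod_data:
    external: true                  # created elsewhere, not by this file
```

Run it on demand with `docker compose run --rm backup`.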

Follow-up questions

Q: Can two containers write to the same file simultaneously?


A: Yes, but the OS does not coordinate. Last-write-wins or partial writes are likely. Application-level coordination (locks, atomic writes, append-only logs) is your responsibility.
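A minimal sketch of the file-lock option, runnable in plain shell (paths are illustrative; `flock` comes from util-linux and must be present in every image that participates):

```shell
# Two writers serialize appends through an advisory lock file.
# Inside containers, LOCK and LOG would live on the shared volume.
LOCK=./data/app.lock
LOG=./data/log.txt
mkdir -p ./data
: > "$LOG"

append_line() {
  # flock(1) blocks until the lock on $LOCK is free, then runs the append.
  flock "$LOCK" sh -c "echo '$1' >> '$LOG'"
}

append_line "writer-a"
append_line "writer-b"
cat "$LOG"
```

Advisory locks only work if every writer cooperates — a process that skips `flock` can still corrupt the file.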

Q: Are shared volumes safe across host reboots?


A: Yes — volumes are persistent. Reboot the host, restart the containers with the same volume mounts, data is intact.

Q: What is the difference between a shared volume and a bind mount of the same host path?


A: Functionally similar — both let multiple containers see the same files. Volumes are Docker-managed (portable, lifecycle controlled), bind mounts pin to a host path (not portable).

Q: How do I share data between containers on different hosts?


A: A few options. (1) Network service (DB, Redis) accessible from both hosts. (2) Network filesystem (NFS, GlusterFS, EFS) mounted on both hosts and bind-mounted into containers. (3) Object storage (S3, Minio) with both containers as clients.
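Option (2) can also be expressed as a Docker-managed volume backed by NFS, so containers keep using a named volume (the server address 192.168.1.10 and export path are hypothetical):

```yaml
volumes:
  nfs-shared:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.168.1.10,rw"   # hypothetical NFS server
      device: ":/exports/shared"  # hypothetical export path
```

Every host that mounts `nfs-shared` sees the same files, but NFS inherits all the concurrent-write caveats discussed above — and adds network latency.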

Q: (Senior) When does sharing a volume become an architectural smell?


A: When two services share state without a clear owner — both write to the same files. That coupling makes scaling them independently impossible: one's bug becomes the other's problem. The smell is followed by the fix: introduce an owning service (DB, queue) that becomes the single source of truth, with both "sharers" as clients. Filesystem-as-IPC is fine for caches and immutable data; for live mutable state, prefer a service.

Examples

Sidecar log shipper pattern

```yaml
services:
  app:
    image: myapp
    volumes:
      - applogs:/var/log/app
  fluentd:
    image: fluent/fluentd
    volumes:
      - applogs:/fluentd/log:ro   # read-only — fluentd does not write app logs
      - ./fluent.conf:/fluentd/etc/fluent.conf:ro
    depends_on: [app]
volumes:
  applogs:
```

App writes logs to a volume, fluentd reads them and ships to an aggregator. Loose coupling, single owner of writes.

Build-and-publish pipeline

```yaml
services:
  build:
    image: builder
    volumes:
      - artifact:/out
    command: build-script.sh
    restart: "no"
  publish:
    image: publisher
    volumes:
      - artifact:/in:ro   # publisher does not modify
    depends_on:
      build:
        condition: service_completed_successfully
    restart: "no"
volumes:
  artifact:
```

Build stage produces files, publish stage consumes them. The volume is the handoff. service_completed_successfully makes the order explicit.

Cert sharing between web and certbot

```yaml
services:
  web:
    image: nginx:1.27-alpine
    volumes:
      - certs:/etc/letsencrypt:ro
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    ports: ["80:80", "443:443"]
  certbot:
    image: certbot/certbot
    volumes:
      - certs:/etc/letsencrypt          # certbot writes
      - ./acme-challenge:/var/www/acme-challenge
    entrypoint: |
      sh -c "trap exit TERM; while :; do
        certbot renew --webroot -w /var/www/acme-challenge --quiet
        sleep 12h & wait $${!}
      done"
volumes:
  certs:
```

Certbot writes certs to the volume; nginx reads them. Renewal happens periodically; signal nginx to reload (e.g. `docker compose exec web nginx -s reload`) so it picks up new certs without a container restart — the reload must run in the web container, not the certbot one.

