How to scale services in Docker Compose?
Scaling a service in Docker Compose means running multiple containers of the same service on one host. It is genuinely useful for parallel work (workers, batch processing) but limited compared to multi-host orchestrators.
Theory
TL;DR
- `docker compose up --scale <service>=N` runs N containers of that service. `deploy.replicas` in `compose.yaml` (Compose v2+) is the declarative equivalent.
- All replicas run on the same host; Compose has no scheduling across machines.
- Replicas share the project network; DNS for the service name resolves to all replicas (round-robin).
- You cannot publish a fixed host port from multiple replicas (port conflict). Use `expose:` (internal-only) and front with a reverse proxy.
- For multi-host, use Swarm (`docker stack deploy`), Kubernetes, or similar.
Quick example
```yaml
# compose.yaml
services:
  worker:
    image: myworker
    deploy:
      replicas: 3
  api:
    image: myapi
    expose:
      - "3000"   # internal only; no fixed host port
    deploy:
      replicas: 5
  web:
    image: nginx
    ports:
      - "80:80"
    # nginx config load-balances to api:3000 (Docker DNS round-robins)
```

```bash
docker compose up -d
# 3 worker replicas + 5 api replicas + 1 web reverse proxy.
docker compose ps
# myapp-worker-1, myapp-worker-2, myapp-worker-3
# myapp-api-1 ... myapp-api-5
# myapp-web-1
```

Three services, one stack, scaled differently.
Imperative scaling at runtime
```bash
# Increase to 10 api replicas without restarting other services
docker compose up -d --scale api=10

# Decrease back
docker compose up -d --scale api=3

# Multiple services at once
docker compose up -d --scale api=5 --scale worker=10
```

The `--scale` flag overrides `deploy.replicas` for that run. Compose stops or starts containers to match the requested count.
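To confirm the result, `docker compose ps` accepts a service name, so you can check one service's replica count directly:

```bash
# List only the api containers and their states
docker compose ps api
```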
How DNS resolves multiple replicas
Docker's embedded DNS returns ALL IPs for a service name. Many clients (HTTP libraries, language stdlib) round-robin over the returned A records:
```bash
$ docker compose exec web nslookup api
Name:      api
Address 1: 172.18.0.5
Address 2: 172.18.0.6
Address 3: 172.18.0.7
# Three replicas; clients round-robin among them.
```

Whether your client actually distributes load depends on its DNS-resolution behavior: most modern HTTP clients rotate through the returned records, but a client that caches the first IP or holds keep-alive connections will pin to one replica.
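A quick way to see the distribution in practice. This sketch assumes the api image exposes an endpoint (here a hypothetical `/hostname`) that returns the serving container's hostname:

```bash
# Hit the service name six times from inside the network; /hostname is a
# hypothetical endpoint that echoes the serving container's hostname.
for i in $(seq 1 6); do
  docker compose exec web curl -s http://api:3000/hostname
  echo
done
# A healthy spread (myapp-api-1, myapp-api-2, ...) means requests are being
# distributed; the same name every time means the client is pinned.
```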
Why fixed ports break with replicas
```yaml
services:
  api:
    image: myapi
    ports:
      - "3000:3000"   # ← problem when scaled
    deploy:
      replicas: 3
```

```bash
$ docker compose up -d
ERROR: only one of api can be running at port 3000
```

Two containers cannot bind to the same host port. Solutions:
1. Random host ports: `ports: ["3000"]` with no host-side mapping; Docker picks a free host port for each replica.
2. `expose:` only: internal access only; a reverse proxy handles external traffic.
3. Range mapping: `ports: ["3000-3009:3000"]` gives each replica one port from the range.

The production answer is almost always #2: a reverse proxy in front, replicas internal (see the sketch below).
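For completeness, a minimal sketch of options 1 and 3 (option 2 is the reverse-proxy pattern in the next section):

```yaml
services:
  api:
    image: myapi
    deploy:
      replicas: 3
    ports:
      - "3000"             # option 1: Docker assigns a free host port per replica
      # - "3000-3009:3000" # option 3: each replica takes one port from the range
```

```bash
# See which host ports were assigned
docker compose ps
```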
Reverse proxy pattern
```yaml
services:
  api:
    image: myapi
    expose: ["3000"]
    deploy: { replicas: 5 }
  web:
    image: nginx:1.27-alpine
    ports: ["80:80"]
    volumes:
      # conf.d files are included inside the default http{} context,
      # so the snippet below works without replacing the full nginx.conf
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
    depends_on: [api]
```

```nginx
# nginx.conf
upstream api_backend {
    server api:3000;   # Docker DNS resolves to all replica IPs
}
server {
    listen 80;
    location / {
        proxy_pass http://api_backend;
    }
}
```

The upstream `api:3000` lookup hits Docker DNS, gets all the replica IPs back, and nginx load-balances across them. Replicas can scale up and down, with one caveat below.
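The caveat: open-source nginx resolves the names in an `upstream` block once, at startup, and caches the IPs. After scaling, reload nginx so it re-resolves:

```bash
docker compose up -d --scale api=8
docker compose exec web nginx -s reload   # pick up the new replica IPs
```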
Scaling with Traefik (auto-discovery)
```yaml
services:
  api:
    image: myapi
    expose: ["3000"]
    deploy: { replicas: 5 }
    labels:
      - "traefik.http.routers.api.rule=Host(`api.local`)"
      - "traefik.http.services.api.loadbalancer.server.port=3000"
  traefik:
    image: traefik:v3
    command:
      - --providers.docker   # required: enables discovery via the Docker API
      - --entrypoints.web.address=:80
    ports: ["80:80", "8080:8080"]
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
```

Traefik watches Docker events and auto-updates its routes when replicas come and go. Cleaner than nginx for dynamic scaling.
What Compose scaling does NOT do
- Health-aware load balancing. Docker DNS returns all replicas' IPs whether they are healthy or not.
- Multi-host scheduling. All replicas run on one machine; that host's CPU and memory are the ceiling.
- Auto-scaling. There is no `min: 3, max: 20, target_cpu: 70%`; you set the count manually.
- Rolling updates with controlled parallelism. `docker compose up` recreates containers in parallel.

For any of these, you need Swarm or Kubernetes.
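For contrast, a sketch of the Swarm side: the same `deploy:` key grows Swarm-only fields (ignored by plain `docker compose up`) that cover rolling updates, and the stack is deployed with `docker stack deploy`:

```yaml
services:
  api:
    image: myapi
    deploy:
      replicas: 5
      update_config:
        parallelism: 1   # replace one container at a time
        delay: 10s       # wait between batches
```

```bash
docker stack deploy -c compose.yaml myapp
```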
Common mistakes
Trying to scale a service with a published port
Covered above. The fix is expose: + reverse proxy.
Scaling stateful services
```yaml
services:
  db:
    image: postgres:16
    deploy: { replicas: 3 }   # ← BAD
```

Three postgres containers on one host are either three independent databases (each replica gets its own anonymous volume) or, if they share a named volume, multiple servers fighting over the same data directory, which means corruption. Stateful services are not scaled this way; replication and HA are different concerns.
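The usual pattern is the inverse: scale the stateless tier and keep the database at a single replica with a named volume.

```yaml
services:
  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data   # one writer, one data directory
  api:
    image: myapi
    deploy:
      replicas: 5   # scale here, not on db
    depends_on: [db]
volumes:
  pgdata:
```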
Scaling worker services without queue idempotency
If workers process messages from a queue, running multiple workers in parallel is exactly where scaling wins. But the workers must be idempotent (a message processed twice has no harmful effect); otherwise scaling means bugs. One dedupe approach is sketched below.
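A minimal sketch of one way to get idempotency, assuming each job carries a unique ID and Redis is reachable; `process_job` is a hypothetical function standing in for the real work:

```bash
# SET ... NX succeeds only for the first worker to claim this job ID,
# so a redelivered message becomes a no-op. The marker expires after a day.
if [ "$(redis-cli SET "processed:$JOB_ID" 1 NX EX 86400)" = "OK" ]; then
  process_job "$JOB_ID"   # hypothetical: the actual work
else
  echo "job $JOB_ID already handled, skipping"
fi
```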
Forgetting that --scale is per-run
```bash
docker compose up -d --scale api=5
# api at 5 replicas

docker compose up -d   # next run, no --scale
# api back to deploy.replicas count (default 1 if not set)
```

`--scale` does not persist. For a permanent count, set `deploy.replicas` in the YAML.
Real-world usage
- Worker pools (image processing, background jobs): scale `worker` to N, a queue feeds tasks, processing is idempotent.
- Stateless API behind a proxy: api replicas + nginx/Traefik for load balancing. Easy horizontal scale on one host.
- CI test parallelism: scale a `runner` service to match CPU cores for parallel test execution.
- Migrations to Swarm/K8s: Compose `deploy.replicas` is the same key Swarm uses, so the syntax carries over when you graduate.
Follow-up questions
Q: What is the difference between docker compose up --scale and docker service scale (Swarm)?
A: compose --scale is single-host. Swarm service scale distributes across cluster nodes. Same idea, different scope.
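Side by side (the Swarm service name `mystack_api` is illustrative):

```bash
docker compose up -d --scale api=5   # 5 containers, all on this host
docker service scale mystack_api=5   # 5 tasks, spread across Swarm nodes
```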
Q: How do replicas share state?
A: They do not, by default. Each replica gets its own writable layer. For shared state, point all replicas at the same external service (DB, Redis) or use a named volume.
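If you do share a named volume, a sketch of what that looks like; it is only safe when the app tolerates concurrent access (read-mostly assets, a file cache):

```yaml
services:
  api:
    image: myapi
    deploy: { replicas: 3 }
    volumes:
      - shared-cache:/var/cache/app   # same volume mounted into every replica
volumes:
  shared-cache:
```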
Q: Is Compose scaling production-ready?
A: For single-host workloads with appropriate front-ending (reverse proxy), yes. For HA, no — one host failure takes down all replicas. Multi-host means Swarm or K8s.
Q: Can I scale to zero?
A: Yes: docker compose up -d --scale api=0. Stops all api containers. Useful for temporarily disabling a service.
Q: (Senior) How would you design a single-host Compose stack for a workload that occasionally bursts to 10x normal load?
A: Three options. (1) Pre-scale to peak (waste during normal load). (2) Manual scale-up at known burst times via cron-triggered docker compose up --scale. (3) Move to Swarm or K8s with HPA. For single-host, option 2 (cron-driven) is the realistic answer; for genuine elasticity, you have to leave Compose.
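A sketch of option 2, assuming the compose project lives at a hypothetical /srv/myapp and the burst window is known:

```bash
# crontab: scale up before the daily burst, back down after
0 8  * * *  cd /srv/myapp && docker compose up -d --scale api=20
0 11 * * *  cd /srv/myapp && docker compose up -d --scale api=3
```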
Examples
Scaled API with auto load balancing via Traefik
```yaml
services:
  api:
    image: myorg/api:1.0
    expose: ["3000"]
    deploy:
      replicas: 4
    labels:
      - "traefik.http.routers.api.rule=Host(`api.local`)"
      - "traefik.http.services.api.loadbalancer.server.port=3000"
      - "traefik.http.services.api.loadbalancer.healthcheck.path=/health"
  traefik:
    image: traefik:v3
    command:
      - --api.insecure=true
      - --providers.docker
      - --entrypoints.web.address=:80
    ports:
      - "80:80"
      - "8080:8080"   # Traefik dashboard
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
```

```bash
$ docker compose up -d
$ curl -H 'Host: api.local' http://localhost/
# Round-robined across 4 api replicas; Traefik checks /health
$ docker compose up -d --scale api=8
# Traefik picks up the new replicas automatically
```

Background workers reading from a queue
```yaml
services:
  redis:
    image: redis:7
  worker:
    image: myorg/worker:1.0
    deploy:
      replicas: 5
    environment:
      REDIS_URL: redis://redis:6379
    depends_on: [redis]
```

Five worker containers connect to the same Redis queue. Each pulls jobs independently. Linear speedup as you add workers, up to the queue's throughput.
Scaling at runtime
```bash
# Normal load
$ docker compose up -d
# 1 api by default

# Black Friday
$ docker compose up -d --scale api=20
# 20 api replicas now running

# After the rush
$ docker compose up -d --scale api=3
# Back to 3
```

No restart of unrelated services. Compose only adjusts what changed.