
What is a Docker overlay network and when do you need it?

A Docker overlay network is the answer to "how do containers on host A talk to containers on host B?". It builds a virtual Layer 2 network on top of the existing IP network using VXLAN encapsulation, so containers see each other as if they were on the same physical LAN, regardless of which host they actually run on.

Theory

TL;DR

  • Overlay = a network that spans multiple Docker hosts (Swarm nodes).
  • Built on VXLAN (Virtual eXtensible LAN, RFC 7348). Each container packet is wrapped in a UDP packet and shipped over the underlay network.
  • Requires Docker Swarm — docker swarm init first. Without Swarm, docker network create --driver overlay errors out.
  • Encrypted with --opt encrypted on creation; uses IPsec/AES-GCM in modern Swarm.
  • Use when: Swarm services need cross-host service discovery and load balancing.
  • Skip when: single host (bridge is enough), or Kubernetes (it has its own CNI plugins, not Docker overlay).

Quick example

bash
# Initialize Swarm on the first node
node-1$ docker swarm init --advertise-addr 192.168.1.10
# Outputs a join token; run on each worker:
node-2$ docker swarm join --token SWMTKN-... 192.168.1.10:2377

# On the manager, create an overlay network
node-1$ docker network create --driver overlay --attachable appnet

# Deploy a service across nodes
node-1$ docker service create --name api --network appnet --replicas 5 myapp
# 5 containers spread across both nodes; all on appnet, all reach each other.

# Containers on node-1 and node-2 can reach each other by service name
node-1$ docker exec <api-container> ping api   # resolved by Swarm's embedded DNS

Without overlay, the api container on node-1 has no IP route to a container on node-2. With overlay, traffic gets VXLAN-wrapped and tunneled.

How VXLAN encapsulation works

container A (10.0.0.5) on node-1              container B (10.0.0.6) on node-2
        |                                              ^
        | packet: src=10.0.0.5 dst=10.0.0.6            |
        v                                              |
+-------------------+                          +-------------------+
|   overlay vxlan   |                          |   overlay vxlan   |
| wraps in UDP/4789 |                          |      unwraps      |
+-------------------+                          +-------------------+
        |                                              ^
        |  on the wire: UDP 4789,                      |
        |  payload = original container packet         |
        v                                              |
host eth0 (192.168.1.10) ───── underlay ─────► host eth0 (192.168.1.20)

The overlay network has its own IP space (e.g., 10.0.0.0/24). Each container gets an IP there. The host has a separate IP on the underlay (192.168.1.0/24). Container packets are wrapped in UDP packets to the destination host's underlay IP.
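You can see the two address spaces from the CLI. A quick sketch, assuming you are on a Swarm manager; the network name `demo-net` and the 10.0.9.0/24 subnet are made up for illustration:

```shell
# Create an overlay with an explicit overlay subnet (10.0.9.0/24 is arbitrary)
docker network create --driver overlay --attachable --subnet 10.0.9.0/24 demo-net

# Show the overlay address space Docker assigns to containers
docker network inspect demo-net --format '{{json .IPAM.Config}}'
# e.g. [{"Subnet":"10.0.9.0/24"}] -- unrelated to the hosts' 192.168.x.x addresses

# The underlay addresses are just the hosts' own interfaces
ip -4 addr show eth0
```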

Service discovery on overlay

Overlay networks come with Docker Swarm's embedded DNS plus built-in virtual-IP load balancing (and, for published ports, the "routing mesh"):

bash
# Service "api" has 5 replicas across 3 nodes. Inside any container on appnet:
$ nslookup api
Server:    127.0.0.11
Address 1: 10.0.1.5 api
# A virtual IP that load-balances across the 5 replicas via IPVS.

The service name resolves to a virtual IP. Connections to that VIP are distributed among healthy replicas by the kernel's IPVS module on each node. Containers do not need to know which physical host runs which replica.
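Swarm also offers a DNS round-robin alternative to the VIP. A hedged sketch reusing the `appnet`/`myapp` names from the earlier example, assuming a Swarm manager:

```shell
# Default endpoint mode (vip): "api" resolves to one virtual IP, IPVS balances
docker service create --name api --network appnet --replicas 5 myapp

# Alternative: DNS round-robin -- no VIP, DNS returns each replica's IP directly
docker service create --name api-rr --network appnet --replicas 5 \
  --endpoint-mode dnsrr myapp

# From any container on appnet:
#   nslookup api        -> the single VIP (connections balanced by IPVS)
#   nslookup tasks.api  -> one A record per running replica
```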

Encrypted overlays

bash
docker network create --driver overlay --opt encrypted my-secure-net

Adds AES-GCM encryption between Swarm nodes. Useful when the underlay network is untrusted (cross-region, public internet between nodes). Performance cost is real but usually acceptable.

When you need overlay

  • Docker Swarm services running on more than one node and needing to talk to each other.
  • Service-to-service traffic that should not transit the public internet (use overlay across VPN-linked datacenters).
  • Anywhere a single bridge cannot reach because containers live on different hosts.

When you do NOT need overlay

  • Single Docker host: bridge is simpler and faster.
  • Kubernetes: uses CNI plugins (Calico, Flannel, Cilium) — not Docker overlay. K8s has its own equivalent at the cluster level.
  • External communication: containers reach the internet via NAT regardless of network type. Overlay is for container-to-container, not container-to-internet.

Common mistakes

Trying overlay without Swarm

bash
$ docker network create --driver overlay test
Error response from daemon: This node is not a swarm manager.

Overlay is a Swarm feature. docker swarm init first.

Forgetting --attachable

bash
# Without --attachable, only Swarm services can join;
# standalone `docker run --network myoverlay` fails.
docker network create --driver overlay myoverlay

# With --attachable, both services and standalone containers can attach.
docker network create --driver overlay --attachable myoverlay

For mixed workflows (service + ad-hoc debug containers), use --attachable.

Underlay firewall blocking VXLAN

VXLAN uses UDP port 4789. Swarm also needs TCP/UDP 7946 (gossip) and TCP 2377 (manager API). Cloud security groups or on-prem firewalls between nodes must allow these.
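On Ubuntu hosts with ufw, opening these ports looks roughly like this. A sketch only: adapt to your firewall or cloud security groups, and treat the 192.168.1.0/24 underlay subnet as an assumption:

```shell
sudo ufw allow 2377/tcp   # Swarm manager API (node joins, control plane)
sudo ufw allow 7946/tcp   # gossip-based node discovery
sudo ufw allow 7946/udp
sudo ufw allow 4789/udp   # VXLAN data plane
# Encrypted overlays (--opt encrypted) additionally use IPsec ESP (IP proto 50):
sudo ufw allow proto esp from 192.168.1.0/24 to any
```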

Picking overlay for a single-host setup

No benefit, real overhead (VXLAN encap, even when both endpoints are local). Stick with bridge until you actually go multi-host.

Real-world usage

  • Swarm production: every cross-host service connection. Swarm builds an overlay network called ingress automatically for the routing mesh.
  • Multi-region Swarm: encrypted overlay across cloud regions, VXLAN tunneling over WAN. Works but be honest about latency.
  • Hybrid demos / lab setups: overlay across a couple of laptops at a workshop to simulate multi-host without K8s.

Follow-up questions

Q: What is the difference between overlay and bridge?


A: Bridge is single-host (all containers on one Docker daemon). Overlay is multi-host (containers can be on different machines). Overlay carries the L2 abstraction across the IP underlay; bridge is just a Linux bridge on one host.

Q: Does overlay work with regular docker run containers, or only services?


A: Both, if you create the overlay with --attachable. Without that flag, only Swarm services can use it.

Q: Why not always use overlay even on single host?


A: Performance overhead (VXLAN encapsulation). Complexity (requires Swarm running). For single host, bridge gives the same container DNS without the overhead.
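For comparison, a user-defined bridge already gives you name-based discovery on one host. A minimal sketch; the container and network names are illustrative:

```shell
# User-defined bridge: Docker's embedded DNS resolves container names
docker network create --driver bridge appnet-local
docker run -d --name db --network appnet-local postgres:16
docker run --rm --network appnet-local alpine ping -c 1 db   # "db" resolves by name
```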

Q: How is Docker overlay different from Kubernetes pod networking?


A: Both solve cross-host container-to-container connectivity. Docker overlay is built into Swarm and uses VXLAN. Kubernetes uses CNI plugins, of which there are many (some VXLAN-based like Flannel, others routing-based like Calico). Different ecosystems, similar problem.

Q: (Senior) What is the routing mesh in Docker Swarm and how does it relate to overlay?


A: The routing mesh is a special overlay called ingress that Swarm creates automatically. When you publish a service port (-p 80:80 on docker service create), every Swarm node listens on host:80, and traffic that arrives is forwarded over the overlay to a node actually running a replica. Combined with IPVS load balancing inside each node, you get any-node-can-receive-traffic behavior without external load balancers. Caveat: routing-mesh DNAT can break source IP visibility — use --publish mode=host if you need real client IPs.
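The two publish modes mentioned above can be sketched as follows (nginx is a stand-in image):

```shell
# Routing mesh (default): every node listens on :80; source IPs are DNATed
docker service create --name web --publish published=80,target=80 nginx

# Host mode: the port opens only on nodes running a replica; client IP preserved
docker service create --name web2 \
  --publish mode=host,published=8080,target=80 nginx
```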

Examples

Setting up a two-node Swarm with overlay

bash
# Node 1 (manager)
node-1$ docker swarm init --advertise-addr 192.168.1.10
Swarm initialized: current node (xxx) is now a manager.
To add a worker: docker swarm join --token SWMTKN-... 192.168.1.10:2377

# Node 2 (worker)
node-2$ docker swarm join --token SWMTKN-... 192.168.1.10:2377

# Verify
node-1$ docker node ls
ID    HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS
xxx   node-1     Ready    Active         Leader
yyy   node-2     Ready    Active

# Create overlay
node-1$ docker network create --driver overlay --attachable appnet

# Deploy a service spread across both nodes
node-1$ docker service create --name api --replicas 4 --network appnet myapp

# Confirm placement
node-1$ docker service ps api
# 4 replicas, 2 on each node

Encrypted overlay across regions

bash
# On a Swarm where nodes live in different cloud regions
$ docker network create --driver overlay \
    --opt encrypted \
    --subnet 10.5.0.0/16 \
    secure-net

$ docker service create --name api --replicas 6 --network secure-net myapp
# Service traffic between us-east-1 and eu-west-1 is now AES-encrypted in transit.

The encryption flag adds IPsec ESP between hosts. CPU cost: noticeable for high-throughput services; usually fine for normal RPC traffic.
