# What is a Docker overlay network and when do you need it?

## Short answer

**A Docker overlay network** lets containers running on different hosts communicate as if on the same LAN. It encapsulates traffic in VXLAN packets and routes them across the underlay network. Required for Docker Swarm services that span multiple nodes.

```bash
# On a Swarm manager
docker network create --driver overlay --attachable myoverlay
docker service create --name api --network myoverlay myapp
```

**Key:** overlay = multi-host. If you only have one Docker host, you do not need it — bridge is enough. If you run a Swarm cluster, every cross-node service connection goes through an overlay network.

## Answer

**A Docker overlay network** is the answer to "how do containers on host A talk to containers on host B?". It builds a virtual L2 network on top of the existing IP network using VXLAN encapsulation, so containers see each other as if on the same physical LAN regardless of which host they actually run on.

## Theory

### TL;DR

- Overlay = a network that spans **multiple Docker hosts** (Swarm nodes).
- Built on **VXLAN** (Virtual eXtensible LAN, RFC 7348). Each container packet is wrapped in a UDP packet and shipped over the underlay network.
- Requires Docker Swarm — `docker swarm init` first. Without Swarm, `docker network create --driver overlay` errors out.
- Can be encrypted with `--opt encrypted` at creation; uses IPsec/AES-GCM in modern Swarm.
- Use when: Swarm services need cross-host service discovery and load balancing.
- Skip when: single host (bridge is enough), or Kubernetes (it has its own CNI plugins, not Docker overlay).

### Quick example

```bash
# Initialize Swarm on the first node
node-1$ docker swarm init --advertise-addr 192.168.1.10

# Outputs a join token; run on each worker:
node-2$ docker swarm join --token SWMTKN-... 192.168.1.10:2377

# On manager, create an overlay network
node-1$ docker network create --driver overlay --attachable appnet

# Deploy a service across nodes
node-1$ docker service create --name api --network appnet --replicas 5 myapp
# 5 containers spread across both nodes; all on appnet, all reach each other.

# Containers on node-1 and node-2 can ping each other by service name
node-1$ docker exec <api-container> ping api   # resolves via Swarm DNS to the service VIP
```

Without overlay, the api container on node-1 has no IP route to a container on node-2. With overlay, traffic gets VXLAN-wrapped and tunneled.

### How VXLAN encapsulation works

```
container A (10.0.0.5)                               container B (10.0.0.6)
      on node-1                                            on node-2
         |                                                     ^
         | packet: src=10.0.0.5 dst=10.0.0.6                   |
         v                                                     |
+--------------------+                               +--------------------+
|    overlay vxlan   |                               |    overlay vxlan   |
| wraps in UDP/4789  |   underlay: 192.168.1.10/.20  |      unwraps       |
+--------------------+ ────────────────────────────► +--------------------+
         |                                                     ^
         |     on the wire: UDP 4789, payload = original packet
         v                                                     |
host eth0 (192.168.1.10) ───────────────────────────► host eth0 (192.168.1.20)
```

The overlay network has its own IP space (e.g., 10.0.0.0/24). Each container gets an IP there. The host has a separate IP on the underlay (192.168.1.0/24). Container packets are wrapped in UDP packets to the destination host's underlay IP.
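You can watch the encapsulation happen on the wire. A minimal sketch, assuming the underlay NIC is called `eth0` (adjust for your hosts):

```bash
# Watch the underlay interface for VXLAN traffic while containers on different
# nodes talk to each other (eth0 is an assumption; substitute your underlay NIC)
sudo tcpdump -ni eth0 udp port 4789

# In another terminal, generate some cross-node traffic, e.g.:
#   node-1$ docker exec <api-container> ping api
# Each captured UDP/4789 packet carries the original container-to-container frame as its payload.
```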
### Service discovery on overlay

Overlay networks come with Docker Swarm's embedded DNS plus built-in load balancing (a virtual IP per service, backed by IPVS):

```bash
# Service "api" has 5 replicas across 3 nodes. Inside any container on appnet:
$ nslookup api
Server:     127.0.0.11
Address 1:  10.0.1.5 api
# A virtual IP that load-balances across the 5 replicas via IPVS.
```

The service name resolves to a virtual IP. Connections to that VIP are distributed among healthy replicas by the kernel's IPVS module on each node. Containers do not need to know which physical host runs which replica.

### Encrypted overlays

```bash
docker network create --driver overlay --opt encrypted my-secure-net
```

Adds AES-GCM encryption between Swarm nodes. Useful when the underlay network is untrusted (cross-region, public internet between nodes). Performance cost is real but usually acceptable.

### When you need overlay

- **Docker Swarm services** running on more than one node and needing to talk to each other.
- **Service-to-service traffic** that should not transit the public internet (use overlay across VPN-linked datacenters).
- **Anywhere a single bridge cannot reach** because containers live on different hosts.

### When you do NOT need overlay

- **Single Docker host:** bridge is simpler and faster.
- **Kubernetes:** uses CNI plugins (Calico, Flannel, Cilium) — not Docker overlay. K8s has its own equivalent at the cluster level.
- **External communication:** containers reach the internet via NAT regardless of network type. Overlay is for container-to-container, not container-to-internet.

### Common mistakes

**Trying overlay without Swarm**

```bash
$ docker network create --driver overlay test
Error response from daemon: This node is not a swarm manager.
```

Overlay is a Swarm feature. `docker swarm init` first.

**Forgetting `--attachable`**

```bash
# Without --attachable, only Swarm services can join.
# Standalone `docker run --network myoverlay` fails.
docker network create --driver overlay myoverlay

# With --attachable, both services and standalone containers can attach.
docker network create --driver overlay --attachable myoverlay
```

For mixed workflows (service + ad-hoc debug containers), use `--attachable`.

**Underlay firewall blocking VXLAN**

VXLAN uses **UDP port 4789**. Swarm also needs **TCP/UDP 7946** (gossip) and **TCP 2377** (manager API). Cloud security groups or on-prem firewalls between nodes must allow these.

**Picking overlay for a single-host setup**

No benefit, real overhead (VXLAN encap, even when both endpoints are local). Stick with bridge until you actually go multi-host.

### Real-world usage

- **Swarm production:** every cross-host service connection. Swarm builds an overlay network called `ingress` automatically for the routing mesh.
- **Multi-region Swarm:** encrypted overlay across cloud regions, VXLAN tunneling over WAN. Works but be honest about latency.
- **Hybrid demos / lab setups:** overlay across a couple of laptops at a workshop to simulate multi-host without K8s.

### Follow-up questions

**Q:** What is the difference between overlay and bridge?
**A:** Bridge is single-host (all containers on one Docker daemon). Overlay is multi-host (containers can be on different machines). Overlay carries the L2 abstraction across the IP underlay; bridge is just a Linux bridge on one host.

**Q:** Does overlay work with regular `docker run` containers, or only services?
**A:** Both, if you create the overlay with `--attachable`. Without that flag, only Swarm services can use it.
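A minimal sketch of that in practice, reusing the `appnet` overlay and `api` service from the earlier example (`alpine` here stands in for any small image with DNS tools):

```bash
# Attach a throwaway debug container to the attachable overlay
docker run -it --rm --network appnet alpine sh

# Inside the container:
nslookup api         # resolves to the service VIP
nslookup tasks.api   # resolves to the individual replica IPs (no VIP, plain DNS round-robin)
ping -c 1 api        # reaches the VIP over the overlay
```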
**Q:** Why not always use overlay even on single host?
**A:** Performance overhead (VXLAN encapsulation). Complexity (requires Swarm running). For a single host, a user-defined bridge gives the same container DNS without the overhead.

**Q:** How is Docker overlay different from Kubernetes pod networking?
**A:** Both solve cross-host container-to-container connectivity. Docker overlay is built into Swarm and uses VXLAN. Kubernetes uses CNI plugins, of which there are many (some VXLAN-based like Flannel, others routing-based like Calico). Different ecosystems, similar problem.

**Q:** (Senior) What is the routing mesh in Docker Swarm and how does it relate to overlay?
**A:** The routing mesh is a special overlay called `ingress` that Swarm creates automatically. When you publish a service port (`-p 80:80` on `docker service create`), every Swarm node listens on host:80, and traffic that arrives is forwarded over the overlay to a node actually running a replica. Combined with IPVS load balancing inside each node, you get any-node-can-receive-traffic behavior without external load balancers. Caveat: routing-mesh DNAT can break source IP visibility — use `--publish mode=host` if you need real client IPs.

## Examples

### Setting up a two-node Swarm with overlay

```bash
# Node 1 (manager)
node-1$ docker swarm init --advertise-addr 192.168.1.10
Swarm initialized: current node (xxx) is now a manager.
To add a worker: docker swarm join --token SWMTKN-... 192.168.1.10:2377

# Node 2 (worker)
node-2$ docker swarm join --token SWMTKN-... 192.168.1.10:2377

# Verify
node-1$ docker node ls
ID     HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS
xxx    node-1     Ready    Active         Leader
yyy    node-2     Ready    Active

# Create overlay
node-1$ docker network create --driver overlay --attachable appnet

# Deploy a service spread across both nodes
node-1$ docker service create --name api --replicas 4 --network appnet myapp

# Confirm placement
node-1$ docker service ps api
# 4 replicas, 2 on each node
```

### Encrypted overlay across regions

```bash
# On a Swarm where nodes live in different cloud regions
$ docker network create --driver overlay \
    --opt encrypted \
    --subnet 10.5.0.0/16 \
    secure-net

$ docker service create --name api --replicas 6 --network secure-net myapp

# Service traffic between us-east-1 and eu-west-1 is now AES-encrypted in transit.
```

The encryption flag adds IPsec ESP between hosts. CPU cost: noticeable for high-throughput services; usually fine for normal RPC traffic.
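To check that the encryption is actually in effect, a short sketch (the interface name `eth0` is an assumption, and the exact `docker network inspect` output varies by Docker version):

```bash
# Confirm the network was created with the encrypted driver option
docker network inspect secure-net --format '{{json .Options}}'
# Expect an "encrypted" key in the options map.

# On a node's underlay interface, cross-node service traffic should now appear
# as ESP packets rather than plaintext VXLAN on UDP 4789
# (eth0 is an assumption; use your underlay NIC)
sudo tcpdump -ni eth0 esp
```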