# What is the cluster module in Node.js?

## Short answer

**The cluster module** in Node.js spawns multiple worker processes that share the same port, typically one per CPU core. Without it, a Node.js process runs your JavaScript on a single core regardless of machine size.

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isPrimary) {
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();
} else {
  http.createServer((req, res) => res.end(`pid: ${process.pid}`)).listen(3000);
}
```

**Key point:** by default the primary process accepts connections and hands them to workers round-robin (on every platform except Windows), so no external proxy is needed.

## Full answer

**The cluster module** lets a Node.js app spawn multiple worker processes that share the same TCP port, typically one per CPU core. By default, a Node.js process runs your JavaScript on one core regardless of how many your machine has. Cluster fixes that.

## Theory

### TL;DR

- Analogy: one manager (primary) runs the front door, multiple cooks (workers) serve from the same address using separate stoves (CPU cores)
- Default Node.js = 1 core; cluster = all cores, no external proxy needed
- The primary distributes incoming connections across workers round-robin by default (except on Windows)
- Rule of thumb: use it when CPU load stays above ~20% and you have at least 2 cores
- Skip it for I/O-heavy apps (DB queries, external APIs) - async I/O handles those without forking

### Quick example

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isPrimary) {
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork(); // one worker per CPU core
  }
  cluster.on('exit', () => cluster.fork()); // restart crashed workers
} else {
  // each worker listens on the same port
  http.createServer((req, res) => {
    res.end(`Worker ${process.pid}\n`);
  }).listen(8000);
}
```

On a 4-core machine, four separate worker processes serve port 8000.
Run `curl localhost:8000` ten times and you see different PIDs. That's the primary handing each new connection to the next worker, nothing else.

### How it works internally

When the primary calls `cluster.fork()`, Node.js starts a new process through `child_process.fork()`. The worker is a fresh Node.js instance running the same script with its own V8 heap, not a copy of the primary's memory. When a worker calls `listen()`, the request is forwarded to the primary over IPC, and the primary creates the listening socket once. Under the default round-robin policy (every platform except Windows), the primary accepts each connection and passes the connection handle to the next worker. Under the alternative policy, the primary sends the listening socket itself to the workers and the operating system decides which worker accepts.

No external proxy sits between the client and the worker. The round-robin happens inside the primary process.

IPC between primary and workers uses the channel that `child_process.fork()` sets up (a Unix domain socket pair on Linux and macOS, a named pipe on Windows). That's how connection handles and custom messages sent with `process.send()` travel between processes. The primary learns about a worker's exit from the child process itself, not from a message.

### When to use cluster

- **CPU-bound work (rule of thumb: sustained load above ~20%):** forking across cores helps. A loop computing 10^8 iterations blocks one thread; spread it across 4 workers and up to four such requests run in parallel.
- **I/O-heavy apps:** skip it. Async/await plus the event loop scales DB and API calls without any forking overhead.
- **Fewer than 2 cores:** the process startup and memory cost cancels any gain. Containers are the case I see most often - someone runs cluster inside a Docker container with 1 CPU and wonders why nothing improved.
- **Docker or Kubernetes:** orchestrate replicas at the container level instead. Cluster inside a single-core container adds nothing.
- **Zero-downtime deploys:** cluster handles this well if you restart workers one at a time.

### Graceful shutdown

A common production problem is dropping in-flight requests when a worker exits. Without `server.close()`, connections get cut mid-response.

```js
const cluster = require('cluster');
const http = require('http');

// placeholder handler; swap in your own
const handler = (req, res) => res.end('ok');

if (!cluster.isPrimary) {
  const server = http.createServer(handler).listen(3000);

  process.on('SIGTERM', () => {
    server.close(() => process.exit(0)); // stop accepting, drain in-flight requests first
    setTimeout(() => process.exit(1), 30_000); // force-exit after 30s
  });
}

if (cluster.isPrimary) {
  process.on('SIGTERM', () => {
    let remaining = Object.keys(cluster.workers).length;
    if (remaining === 0) process.exit(0);
    // exit as soon as the last worker is gone instead of always waiting 30s
    cluster.on('exit', () => {
      if (--remaining === 0) process.exit(0);
    });
    for (const id in cluster.workers) {
      cluster.workers[id].kill('SIGTERM'); // forward the signal to each worker
    }
    setTimeout(() => process.exit(1), 30_000); // safety net
  });
}
```

Kubernetes sends SIGTERM on pod shutdown. Without this pattern, a rolling deploy can drop in-flight requests. If the primary also restarts workers on `exit`, skip the restart once shutdown has begun, otherwise it re-forks the workers it just stopped.

### Common mistakes

**1. Hardcoding worker count**

```js
// wrong
for (let i = 0; i < 4; i++) cluster.fork();
```

On a 64-core server this under-forks; on a 2-core container it wastes memory. Derive the count from the machine with `os.cpus().length`, or `os.availableParallelism()` on Node.js 18.14+. Inside a container with a CPU limit, `os.cpus().length` can still report the host's cores, so set the count explicitly there.

**2. Sharing state via global variables**

```js
const http = require('http');

// wrong - each worker has its own copy of counter
let counter = 0;
http.createServer((req, res) => {
  counter++;
  res.end(counter.toString()); // each worker returns 1, 2, 3 independently
}).listen(3000);
```

Workers are separate processes with separate V8 heaps. A counter in worker 1 never reaches worker 2. Use Redis (`redis.incr('counter')`) or send a message to the primary for shared state.

**3. No exit handler**

A worker crash on a quad-core cluster silently drops you to 75% capacity. Add `cluster.on('exit', () => cluster.fork())` and the primary keeps the right number of workers running.

**4. Listening in the primary**

```js
// wrong - primary handles all traffic, workers sit idle
if (cluster.isPrimary) {
  http.createServer(handler).listen(8000);
}
```

Your `listen()` call belongs in the worker branch. The primary manages lifecycle only.

**5. Session state without Redis**

`req.session.user` stored in memory lives inside one worker. A second request hitting a different worker finds no session. Fix: use a shared session store such as Redis. Sticky sessions at an external proxy pin a client to a host and port, not to a particular worker behind that port, so they only help if each worker is reachable separately.

### Real-world usage

- **PM2:** wraps cluster automatically.
`pm2 start app.js -i max` forks one worker per core. Many Node.js production setups use PM2 rather than raw cluster.
- **Express APIs:** wrap `app.listen()` in the worker branch; the rest of the app setup stays the same.
- **Manual cluster vs PM2:** use raw cluster when you want zero dependencies or need custom restart logic. Use PM2 for monitoring dashboards, log aggregation, and automatic restarts.

### Follow-up questions

**Q:** How does load balancing work without a proxy?
**A:** The primary owns the listening socket. Under the default round-robin policy it accepts each new connection and passes the handle to the next worker over IPC. No external process is involved.

**Q:** What is the difference between cluster and `child_process.fork()`?
**A:** `child_process.fork()` creates a Node.js subprocess with an IPC channel but no port sharing. Cluster is built on top of it and adds shared server handles, so all workers can serve the same port.

**Q:** How do you handle WebSockets with cluster?
**A:** A WebSocket is one long-lived TCP connection, so once established it stays on the worker that accepted it. Sticky routing matters when the client makes several HTTP requests that must reach the same worker, for example a long-polling fallback before the upgrade. In that case use a load balancer like HAProxy that routes by source IP to separately reachable workers, or encode the worker ID in the connection URL.

**Q:** Why can cluster hurt performance on I/O-heavy apps?
**A:** Every worker costs extra memory and startup time, and handing connections to workers adds IPC overhead. A single async Node.js process handles thousands of concurrent DB queries through the event loop. When the CPU is not the bottleneck, forking adds cost with little throughput gain.

**Q:** How would you implement zero-downtime deploys manually?
**A:** Restart workers one at a time. Send SIGTERM to one worker, wait for it to drain via `server.close()`, then fork a replacement. PM2's `pm2 reload` automates this kind of rolling restart.

**Q:** Round-robin vs OS-level distribution?
**A:** Node.js uses round-robin by default on every platform except Windows. You can switch it off with the environment variable `NODE_CLUSTER_SCHED_POLICY=none` or by setting `cluster.schedulingPolicy = cluster.SCHED_NONE` before the first `fork()`. Workers then accept from the shared socket and the OS picks which one gets each connection, which in practice tends to be less evenly balanced. Measure with `ab -n 1000 -c 10 http://localhost:8000/` if you want to compare both approaches on your hardware.

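
The one-at-a-time restart from the zero-downtime answer can be sketched as follows. This is a minimal illustration, not PM2's implementation. `rollingRestart` is a hypothetical helper. It assumes each worker reacts to `SIGTERM` by draining and exiting, that every worker calls `listen()`, and that no `'exit'` handler is already re-forking workers. The `clusterApi` parameter exists only so the function can be exercised with a stub.

```js
const cluster = require('cluster');

// Hypothetical helper: restart workers one at a time from the primary.
async function rollingRestart(clusterApi = cluster) {
  // Snapshot the current workers; replacements forked below are not revisited.
  const oldWorkers = Object.values(clusterApi.workers);

  for (const oldWorker of oldWorkers) {
    // Subscribe before signalling so the 'exit' event cannot be missed.
    const exited = new Promise((resolve) => oldWorker.once('exit', resolve));
    oldWorker.kill('SIGTERM'); // worker is assumed to drain, then exit
    await exited;

    // Fork the replacement and wait until it is accepting connections.
    const replacement = clusterApi.fork();
    await new Promise((resolve) => replacement.once('listening', resolve));
  }
}
```

Capacity drops by one worker while each restart is in flight. Forking the replacement before stopping the old worker would avoid that dip at the cost of briefly running one extra process.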
## Examples

### Basic HTTP server with auto-restart

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} starting`);

  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} died, restarting`);
    cluster.fork();
  });
} else {
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Served by worker ${process.pid}\n`);
  }).listen(8000);

  console.log(`Worker ${process.pid} listening on 8000`);
}
```

This is the minimal production pattern. The primary forks and restarts dead workers and never serves HTTP itself. Workers handle everything else.

### Express API across all CPU cores

```js
const cluster = require('cluster');
const os = require('os');
const express = require('express');

if (cluster.isPrimary) {
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();
  cluster.on('exit', () => cluster.fork());
} else {
  const app = express();

  app.get('/compute/:n', (req, res) => {
    // CPU-bound work - this is where cluster helps
    let sum = 0;
    for (let j = 0; j < 1e8; j++) sum += j;
    res.json({ n: req.params.n, pid: process.pid, sum });
  });

  app.listen(3000, () => console.log(`Worker ${process.pid} ready`));
}
```

Ten concurrent requests to `/compute/1` are spread across the four workers on a quad-core machine, so up to four run at once. Without cluster they queue behind each other on one core. That's the whole point.

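
Both examples above size the pool with `os.cpus().length`. As a hedged sketch, the count could instead come from a small helper that honors an explicit override and falls back to what Node.js reports. The `workerCount` name and the `WORKERS` environment variable are hypothetical, chosen only for illustration. `os.availableParallelism()` exists from Node.js 18.14, so the helper checks for it before calling.

```js
const os = require('os');

// Hypothetical helper: decide how many workers to fork.
// `requested` defaults to a made-up WORKERS env var (an assumption, not a Node.js convention).
function workerCount(requested = process.env.WORKERS) {
  const explicit = Number.parseInt(requested, 10);
  if (Number.isInteger(explicit) && explicit > 0) return explicit;

  // Prefer availableParallelism() where it exists, else count logical CPUs.
  const detected = typeof os.availableParallelism === 'function'
    ? os.availableParallelism()
    : os.cpus().length;
  return Math.max(1, detected);
}
```

In the primary branch this would replace the loop bound, for example `for (let i = 0; i < workerCount(); i++) cluster.fork();`. An explicit override is mainly useful in containers, where the detected value may not match the CPU limit.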
### IPC: worker-to-primary messaging

```js
const cluster = require('cluster');
const http = require('http');

if (cluster.isPrimary) {
  const worker = cluster.fork();

  worker.on('message', (msg) => {
    if (msg.type === 'metrics') {
      console.log(`Worker ${worker.process.pid} handled ${msg.count} requests`);
    }
  });
} else {
  let count = 0;

  http.createServer((req, res) => {
    count++;
    res.end('ok');
  }).listen(3000);

  // report metrics to primary every 5 seconds
  setInterval(() => {
    process.send({ type: 'metrics', count });
  }, 5000);
}
```

This pattern lets the primary aggregate stats from all workers without shared memory.
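
To make the aggregation concrete, here is a minimal sketch of how the primary might keep the latest count per worker and sum them. The `record` and `totalRequests` helpers and the `latestCounts` map are hypothetical names. The `{ type: 'metrics', count }` message shape follows the example above, and the `'message'` event on `cluster` is the one that fires in the primary for a message from any worker.

```js
const cluster = require('cluster');

// Latest cumulative count reported by each worker, keyed by worker id.
const latestCounts = new Map();

// Store a metrics message; ignore anything else.
function record(workerId, msg) {
  if (msg && msg.type === 'metrics' && Number.isFinite(msg.count)) {
    latestCounts.set(workerId, msg.count);
  }
}

// Sum of the latest counts across all workers seen so far.
function totalRequests() {
  let total = 0;
  for (const count of latestCounts.values()) total += count;
  return total;
}

if (cluster.isPrimary) {
  // Registering the listener does not fork anything by itself.
  cluster.on('message', (worker, msg) => record(worker.id, msg));
}
```

Each worker reports a running total, so storing only the latest value per worker and summing avoids double counting. A restarted worker gets a new id and starts from zero, while the dead worker's last count stays in the map.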