# What is the cluster module in Node.js?

**The cluster module** lets a Node.js app spawn multiple worker processes that share the same TCP port, typically one per CPU core. By default, a Node.js process runs your JavaScript on a single thread, so it uses roughly one core no matter how many the machine has. Cluster fixes that.

## Theory

### TL;DR

- Analogy: one manager (primary) runs the front door, multiple cooks (workers) serve from the same address using separate stoves (CPU cores)
- Default Node.js = 1 core; cluster = all cores, no external proxy needed
- The primary accepts incoming connections and hands them to workers round-robin (the default on every platform except Windows)
- Use it when CPU load stays above ~20% and you have more than 2 cores
- Skip it for I/O-heavy apps (DB queries, external APIs) - async handles those without forking

### Quick example

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isPrimary) {
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork(); // one worker per CPU core
  }
  cluster.on('exit', () => cluster.fork()); // restart crashed workers
} else {
  // each worker listens on the same port
  http.createServer((req, res) => {
    res.end(`Worker ${process.pid}\n`);
  }).listen(8000);
}
```

On a 4-core machine, four separate worker processes serve port 8000. Run `curl localhost:8000` ten times and you see different PIDs. That's the cluster module's built-in round-robin distributing load; no external proxy is involved.

### How it works internally

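`cluster.fork()` is built on `child_process.fork()`: each worker is a fresh Node.js process with its own V8 instance, started from the same entry script. It is not a copy of the primary's memory. When a worker calls `listen()`, the request is forwarded to the primary, which creates the listening socket. Under the default round-robin policy (every platform except Windows), the primary accepts each connection and passes it to the next worker. Under the alternative policy, the primary shares the listening socket with the workers and they accept connections directly. Either way, no external proxy sits between the client and the worker.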
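
Round-robin hand-off can be pictured as a rotating index over the list of workers. The sketch below is a toy model, not Node's internals; `createRoundRobin` and the PIDs are made up for illustration.

```js
// Toy model of round-robin hand-off: rotate through the worker list.
function createRoundRobin(workerIds) {
  let next = 0;
  return function pick() {
    const id = workerIds[next];
    next = (next + 1) % workerIds.length; // wrap around after the last worker
    return id;
  };
}

const pick = createRoundRobin([101, 102, 103]); // made-up worker PIDs
const order = Array.from({ length: 5 }, () => pick());
console.log(order); // [ 101, 102, 103, 101, 102 ]
```

Each new connection goes to the next worker in the list, regardless of how busy that worker currently is.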

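IPC between primary and workers runs over Unix domain sockets on POSIX systems (named pipes on Windows). That channel carries custom messages sent with `process.send()` and `worker.send()`, plus the cluster module's own internal messages. Signals and exit codes do not travel over IPC; the primary receives them from the OS, as with any child process.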

### When to use cluster

- **CPU-bound work (above ~20% load):** forking across cores helps. A loop computing 10^8 iterations blocks one thread; spread it across 4 workers and you handle up to 4x the requests in parallel.
- **I/O-heavy apps:** skip it. Async/await plus the event loop scales DB and API calls without any forking overhead.
- **Fewer than 2 cores:** the process startup cost cancels any gain. Containers are the case I see most often - someone runs cluster inside a Docker container with 1 CPU and wonders why nothing improved.
- **Docker or Kubernetes:** orchestrate replicas at the container level instead. Cluster inside a single-core container adds nothing.
- **Zero-downtime deploys:** cluster handles this well if you restart workers one at a time.

### Graceful shutdown

The most common production problem is dropping in-flight requests when a worker exits. Without `server.close()`, connections get cut mid-response.

```js
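const cluster = require('cluster');
const http = require('http');

const handler = (req, res) => res.end('ok\n'); // placeholder request handler

if (!cluster.isPrimary) {
  const server = http.createServer(handler).listen(3000);

  process.on('SIGTERM', () => {
    server.close(() => process.exit(0)); // drain in-flight requests first
    setTimeout(() => process.exit(1), 30_000); // force-kill after 30s
  });
}

if (cluster.isPrimary) {
  cluster.fork(); // worker creation, as in the quick example

  process.on('SIGTERM', async () => {
    // forward the signal so every worker starts draining
    for (const id in cluster.workers) {
      cluster.workers[id].kill('SIGTERM');
    }
    await new Promise(r => setTimeout(r, 30_000)); // give workers time to drain
    process.exit(0);
  });
}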
```

Kubernetes sends SIGTERM on pod shutdown. Without this pattern, every rolling deploy drops some requests.
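
A fixed 30-second wait in the primary makes every shutdown take the full 30 seconds, even when the workers drain in one. One possible refinement is to wait for each worker's `'exit'` event and keep the timeout only as a cap. This is a sketch; `waitForWorkers` is a hypothetical helper, not part of the cluster API.

```js
const { once } = require('events');

// Resolves with 'drained' once every worker has emitted 'exit',
// or with 'timeout' if that takes longer than timeoutMs.
async function waitForWorkers(workers, timeoutMs) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });
  const drained = Promise.all(workers.map((w) => once(w, 'exit')))
    .then(() => 'drained');
  const result = await Promise.race([drained, timeout]);
  clearTimeout(timer); // do not keep the process alive for the full timeout
  return result;
}

// Possible use inside the primary's SIGTERM handler:
//   const workers = Object.values(cluster.workers);
//   workers.forEach((w) => w.kill('SIGTERM'));
//   const result = await waitForWorkers(workers, 30_000);
//   process.exit(result === 'drained' ? 0 : 1);
```

The helper only needs objects that emit `'exit'`, which cluster `Worker` instances do.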

### Common mistakes

**1. Hardcoding worker count**

```js
// wrong
for (let i = 0; i < 4; i++) cluster.fork();
```

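On a 64-core server this under-forks; on a 2-core container it wastes memory. Derive the count at startup from `os.availableParallelism()` (Node.js 18.14+) or `os.cpus().length`. Inside a container these may report the host's cores rather than the container's CPU limit, so allow an override there.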
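
A minimal sketch of deriving the worker count at startup with an optional override. The `WORKER_COUNT` environment variable and the `resolveWorkerCount` helper are hypothetical names chosen for this example.

```js
const os = require('os');

// WORKER_COUNT is a made-up env var; pick whatever name fits your deployment.
function resolveWorkerCount(env = process.env) {
  const requested = Number.parseInt(env.WORKER_COUNT ?? '', 10);
  if (Number.isInteger(requested) && requested > 0) return requested;
  // availableParallelism() exists in Node.js 18.14+; fall back on older versions
  return typeof os.availableParallelism === 'function'
    ? os.availableParallelism()
    : os.cpus().length;
}

console.log(`forking ${resolveWorkerCount()} workers`);
```

The override is useful in CPU-limited containers, where the detected core count can be higher than what the container is allowed to use.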

**2. Sharing state via global variables**

```js
// wrong - each worker has its own copy of counter
let counter = 0;
http.createServer((req, res) => {
  counter++;
  res.end(counter.toString()); // each worker returns 1, 2, 3 independently
}).listen(3000);
```

Workers are separate processes with separate V8 heaps. A counter in worker 1 never reaches worker 2. Use Redis (`redis.incr('counter')`) or send a message to the primary for shared state.
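
The message-to-primary option can be sketched as follows. The message shapes (`increment`, `count`) and `createCounterHandler` are assumptions for this example, and forking the workers is left out.

```js
const cluster = require('cluster');

// Primary side: the only copy of the counter lives here.
function createCounterHandler() {
  let counter = 0;
  return function onMessage(msg, reply) {
    if (msg && msg.type === 'increment') {
      counter += 1;
      reply({ type: 'count', value: counter });
    }
  };
}

if (cluster.isPrimary) {
  const onMessage = createCounterHandler();
  // 'message' on cluster fires for messages from any worker
  cluster.on('message', (worker, msg) => {
    onMessage(msg, (out) => worker.send(out));
  });
  // forking the workers is omitted here
} else {
  process.on('message', (msg) => {
    if (msg.type === 'count') console.log(`global count: ${msg.value}`);
  });
  process.send({ type: 'increment' }); // ask the primary to bump the counter
}
```

This keeps one source of truth without an external store, at the cost of an IPC round trip per update and losing the value when the primary restarts.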

**3. No exit handler**

A worker crash on a quad-core cluster silently drops you to 75% capacity. Add `cluster.on('exit', () => cluster.fork())` and the primary always keeps the right number of workers running.
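
An unconditional restart can spin in a tight loop if a worker crashes during startup. Below is a minimal sketch of a guard, assuming a cap of 5 restarts per minute (arbitrary numbers); `createRestartPolicy` is a hypothetical helper.

```js
const cluster = require('cluster');

// Allows at most maxRestarts restarts within any windowMs period.
function createRestartPolicy({ maxRestarts = 5, windowMs = 60_000 } = {}) {
  const recent = []; // timestamps of recent restarts
  return function shouldRestart(now = Date.now()) {
    while (recent.length && now - recent[0] > windowMs) recent.shift();
    if (recent.length >= maxRestarts) return false;
    recent.push(now);
    return true;
  };
}

if (cluster.isPrimary) {
  const shouldRestart = createRestartPolicy();
  cluster.on('exit', (worker) => {
    if (worker.exitedAfterDisconnect) return; // intentional shutdown, not a crash
    if (shouldRestart()) cluster.fork();
    else console.error('Too many worker crashes, not restarting');
  });
}
```

`worker.exitedAfterDisconnect` is `true` when the worker was stopped on purpose via `.disconnect()` or `.kill()`, so planned shutdowns do not trigger a re-fork.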

**4. Listening in the primary**

```js
// wrong - primary handles all traffic, workers sit idle
if (cluster.isPrimary) {
  http.createServer(handler).listen(8000);
}
```

Listening belongs in the worker branch. The primary manages lifecycle only.

**5. Session state without Redis**

`req.session.user` stored in memory lives inside one worker. A second request hitting a different worker finds no session. Fix: use a Redis session store or configure sticky sessions at the proxy level.
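
Source-IP stickiness boils down to mapping the same client address to the same backend every time. This is a toy sketch of that idea; `pickWorkerIndex` and its simple string hash are illustrative, not what any particular proxy uses.

```js
// Map a client IP to a stable worker index with a simple string hash.
function pickWorkerIndex(ip, workerCount) {
  let hash = 0;
  for (const ch of ip) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep it an unsigned 32-bit int
  }
  return hash % workerCount;
}

const first = pickWorkerIndex('203.0.113.7', 4);
const second = pickWorkerIndex('203.0.113.7', 4);
console.log(first === second); // true: same IP, same worker
```

The trade-offs are that clients behind one NAT all land on the same worker, and changing the worker count remaps most clients. A shared session store avoids both.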

### Real-world usage

- **PM2:** wraps cluster automatically. `pm2 start app.js -i max` forks one worker per core. Most Node.js production setups use PM2 rather than raw cluster.
- **Express APIs:** wrap `app.listen()` in the worker branch; the rest of the app setup stays the same.
- **Manual cluster vs PM2:** use raw cluster when you want zero dependencies or need custom restart logic. Use PM2 for monitoring dashboards, log aggregation, and automatic restarts.

### Follow-up questions

**Q:** How does load balancing work without a proxy?
**A:** Under the default round-robin policy the primary listens on the port, accepts each new connection, and hands it to the next worker over IPC. No external process is involved.

**Q:** What is the difference between cluster and `child_process.fork()`?
**A:** `child_process.fork()` creates a subprocess with an IPC channel but no port sharing. Cluster is built on top of it and adds shared server handles, so all workers can serve the same port.

**Q:** How do you handle WebSockets with cluster?
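**A:** A plain WebSocket is one long-lived TCP connection, so once it reaches a worker it stays there. Sticky sessions matter when the client makes several HTTP requests before or instead of upgrading, as Socket.IO does with its long-polling transport. In that case, use a load balancer like HAProxy that routes by source IP, or a cookie-based sticky policy. Broadcasting to clients connected to other workers also needs a shared channel such as Redis pub/sub.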

**Q:** Why can cluster hurt performance on I/O-heavy apps?
**A:** Fork overhead plus IPC cost adds up. A single async Node.js process handles thousands of concurrent DB queries through the event loop. Forking just adds memory and startup cost with no throughput gain.

**Q:** How would you implement zero-downtime deploys manually?
**A:** Restart workers one at a time. Send SIGTERM to one worker, wait for it to drain via `server.close()`, then fork a replacement. PM2's `pm2 reload` automates exactly this sequence.
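
A sketch of that sequence using `worker.disconnect()` in place of a raw SIGTERM (a swapped-in technique: disconnecting makes the worker close its servers and exit once connections finish, provided nothing else keeps its event loop alive). `rollingRestart` is a hypothetical helper.

```js
const { once } = require('events');

// Replace workers one at a time so the others keep serving traffic.
// `clusterApi` is the cluster module (or a test double with the same shape).
async function rollingRestart(clusterApi) {
  const oldWorkers = Object.values(clusterApi.workers); // snapshot before forking
  for (const oldWorker of oldWorkers) {
    const exited = once(oldWorker, 'exit'); // subscribe before asking it to stop
    oldWorker.disconnect();                 // worker closes its servers and drains
    await exited;
    const replacement = clusterApi.fork();
    await once(replacement, 'listening');   // wait until it accepts connections
  }
}

// In the primary: rollingRestart(require('cluster'));
```

If the primary also re-forks on every `'exit'`, skip workers whose `exitedAfterDisconnect` is `true` in that handler, otherwise each restart produces two replacements.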

**Q:** Round-robin vs random distribution?
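**A:** Round-robin is the default on every platform except Windows. To leave distribution to the operating system instead, set `cluster.schedulingPolicy = cluster.SCHED_NONE` before the first `fork()`, or start the process with `NODE_CLUSTER_SCHED_POLICY=none`. Round-robin usually spreads connections more evenly, because OS-level scheduling tends to favor a few workers. Measure with `ab -n 1000 -c 10` if you want to compare both approaches on your hardware.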
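
In the Node.js cluster API the policy is chosen through `cluster.schedulingPolicy` (or the `NODE_CLUSTER_SCHED_POLICY` environment variable with `rr` or `none`). A minimal sketch of switching it in code:

```js
const cluster = require('cluster');

if (cluster.isPrimary) {
  // Must be set before the first fork(); the setting is frozen after that.
  cluster.schedulingPolicy = cluster.SCHED_NONE; // leave distribution to the OS
  // cluster.schedulingPolicy = cluster.SCHED_RR; // explicit round-robin
}
```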

## Examples

### Basic HTTP server with auto-restart

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} starting`);

  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} died, restarting`);
    cluster.fork();
  });
} else {
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Served by worker ${process.pid}\n`);
  }).listen(8000);

  console.log(`Worker ${process.pid} listening on 8000`);
}
```

This is the minimal production pattern. Primary forks and restarts dead workers, never touches HTTP. Workers handle everything else.

### Express API across all CPU cores

```js
const cluster = require('cluster');
const os = require('os');
const express = require('express');

if (cluster.isPrimary) {
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();
  cluster.on('exit', () => cluster.fork());
} else {
  const app = express();

  app.get('/compute/:n', (req, res) => {
    // CPU-bound work - this is where cluster helps
    let sum = 0;
    for (let j = 0; j < 1e8; j++) sum += j;
    res.json({ n: req.params.n, pid: process.pid, sum });
  });

  app.listen(3000, () => console.log(`Worker ${process.pid} ready`));
}
```

Ten concurrent requests to `/compute/1` hit different workers on a quad-core machine. Without cluster they queue behind each other on one core. That's the whole point.

### IPC: worker-to-primary messaging

```js
const cluster = require('cluster');
const http = require('http');

if (cluster.isPrimary) {
  const worker = cluster.fork();

  // messages sent by this worker via process.send() arrive here
  worker.on('message', (msg) => {
    if (msg.type === 'metrics') {
      console.log(`Worker ${worker.process.pid} handled ${msg.count} requests`);
    }
  });
} else {
  let count = 0;

  http.createServer((req, res) => {
    count++;
    res.end('ok');
  }).listen(3000);

  // report metrics to primary every 5 seconds
  setInterval(() => {
    process.send({ type: 'metrics', count });
  }, 5000);
}
```

This pattern lets the primary aggregate stats from all workers without shared memory.
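
With several workers, the primary needs to keep the latest count per worker and sum them. One way to sketch that, using the same `metrics` message shape as above, is shown here; `createMetricsAggregator` is a hypothetical helper.

```js
const cluster = require('cluster');

// Keeps the most recent request count reported by each worker.
function createMetricsAggregator() {
  const latest = new Map(); // worker id -> last reported count
  return {
    record(workerId, count) { latest.set(workerId, count); },
    remove(workerId) { latest.delete(workerId); },
    total() {
      let sum = 0;
      for (const count of latest.values()) sum += count;
      return sum;
    },
  };
}

if (cluster.isPrimary) {
  const metrics = createMetricsAggregator();
  cluster.on('message', (worker, msg) => {
    if (msg.type === 'metrics') metrics.record(worker.id, msg.count);
  });
  // forking workers and reporting metrics.total() are omitted here
}
```

Storing the latest value per worker, rather than adding every message, avoids double counting, because each worker reports its running total.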
