# What are child processes in Node.js?

**Child processes** in Node.js use the `child_process` module to spawn separate OS processes, freeing the event loop from blocking work like shell commands or CPU-heavy tasks.

```js
const { spawn } = require('child_process');

const child = spawn('ls', ['-la']);
child.stdout.on('data', (data) => console.log(`${data}`));
child.on('close', (code) => console.log(`Exit: ${code}`));
```

**Key:** Use `exec()` for small shell commands, `spawn()` for streaming large output, `fork()` for Node-to-Node IPC messaging.

**Child processes** let Node.js spawn separate OS processes to run shell commands, execute other programs, or offload CPU-heavy work outside the single-threaded event loop.

## Theory

### TL;DR

- Think of child processes as separate workers: each has its own memory and CPU time (and, for Node.js children, its own event loop), while the main process stays free
- Four methods: `exec()` for shell commands with buffered output, `execFile()` for direct binary execution, `spawn()` for streaming large data, `fork()` for Node-to-Node IPC messaging
- The key split: `exec()` and `execFile()` collect all output before the callback fires; `spawn()` and `fork()` stream it
- Decision rule: large or streaming data goes to `spawn()`, simple commands go to `exec()`, two Node.js processes talking go to `fork()`
- Unlike worker threads, which live inside the parent's OS process, child processes are separate OS processes with fully isolated memory

### Quick example

```js
const { spawn } = require('child_process');

// Streams output in real-time without loading it all into memory
const child = spawn('ls', ['-la']);

child.stdout.on('data', (data) => {
  console.log(`Output: ${data}`);
});

child.on('close', (code) => {
  console.log(`Process exited with code ${code}`);
});

// Event loop stays free while child runs
console.log('Main thread not blocked');
```

`spawn()` returns a child process object where `stdout`, `stderr`, and `stdin` are streams. Data arrives as it is produced, not all at once after the process finishes.

### Key difference

The `child_process` module breaks Node.js out of its single-threaded model by creating actual OS processes. Worker threads also get their own V8 isolate and event loop, but they run inside the same OS process and can share memory through `SharedArrayBuffer`. A child process is a fully separate OS process with its own V8 instance, memory heap, and event loop. Communication happens through IPC channels (for `fork()`) or stdin/stdout pipes. Objects passed via `send()` are serialized and deserialized. They are never shared by reference.

### When to use

- **`exec()`**: shell commands with small output (under roughly 1MB). Running `git status`, `npm list`, or any one-liner shell expression. Accepts a shell string, so pipes and redirects work.
- **`execFile()`**: executing a binary or script directly without involving a shell. Safer than `exec()` because it does not parse shell metacharacters. Good for compiled Go binaries, Python scripts, or anything where user input touches the arguments.
- **`spawn()`**: large output, real-time data, or long-running processes. `ffmpeg` video conversion, filtering log files, running build tools. Data flows as it arrives.
- **`fork()`**: running Node.js code in a separate process with bidirectional messaging. CPU-heavy calculations, worker pools, or any case where you want to keep the main process responsive during serious compute work.

### Comparison table

| Method | Shell? | Buffering | Output size | IPC | Best for |
|--------|--------|-----------|-------------|-----|----------|
| `exec()` | Yes | Full buffer | Small (<1MB) | No | Simple shell commands |
| `execFile()` | No | Full buffer | Small (<1MB) | No | Direct execution, safer |
| `spawn()` | No (by default) | Streaming | Unlimited | No (by default) | Large or real-time data |
| `fork()` | No | Streaming | Unlimited | Yes | Node-to-Node communication |

### How it works internally

When you call `spawn()`, Node.js asks the operating system (through libuv) to create a new process: typically `fork()` followed by `exec()` on Unix-like systems, or `CreateProcess()` on Windows. By default the parent gets three pipes connected to the child: stdin, stdout, and stderr. For `fork()` specifically, Node.js also opens an IPC channel using Unix domain sockets or named pipes, which is what enables `child.send()` and `process.on('message')`.

Each forked child boots a complete Node.js runtime with its own V8 instance in a new process. That is why `fork()` carries roughly 30MB of overhead per child compared to about 2MB for a worker thread. Treat both numbers as ballpark figures that vary with Node.js version and platform.

### Common mistakes

**Using `exec()` for large output:**

```js
const { exec, spawn } = require('child_process');

// Wrong: buffers entire output, fails with ERR_CHILD_PROCESS_STDIO_MAXBUFFER
exec('cat huge-file.txt', (error, stdout) => {
  console.log(stdout); // whole file sits in memory
});

// Right: stream it
spawn('cat', ['huge-file.txt']).stdout.pipe(process.stdout);
```

Default `maxBuffer` is 1MB. You can raise it with `{ maxBuffer: 10 * 1024 * 1024 }`, but switching to `spawn()` is the cleaner fix for genuinely large data.
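
To see the limit in action, here is a minimal sketch. It uses the `node` binary itself as the child (via `process.execPath`) so that it does not depend on any particular file; `captureOutput` and `TWO_MB` are hypothetical names, and `execFile()` stands in for `exec()` because both apply `maxBuffer` the same way.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: asks a child Node.js process to print `bytes` characters,
// then reports the error code (if any) and how much output was captured.
function captureOutput(bytes, options = {}) {
  const script = `process.stdout.write('x'.repeat(${bytes}))`;
  return new Promise((resolve) => {
    execFile(process.execPath, ['-e', script], options, (error, stdout) => {
      resolve({ errorCode: error ? error.code : null, captured: stdout.length });
    });
  });
}

const TWO_MB = 2 * 1024 * 1024;

// Default limit (1MB): the callback receives an error instead of the full output.
captureOutput(TWO_MB).then((r) => console.log('default limit:', r));

// Raised limit: the full 2MB fits in the buffer.
captureOutput(TWO_MB, { maxBuffer: 4 * 1024 * 1024 }).then((r) =>
  console.log('raised limit:', r)
);
```

Note that the error is delivered to the callback rather than thrown, so code that ignores the `error` argument silently works with truncated output.
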

**Ignoring error and exit events:**

```js
// Wrong: child crashes without notice, parent keeps running with broken state
const child = spawn('some-command');
child.stdout.on('data', (data) => console.log(`${data}`));

// Right: handle both events
child.on('error', (err) => console.error('Failed to start:', err));
child.on('exit', (code, signal) => {
  if (code !== 0) console.error(`Exited with code ${code}, signal ${signal}`);
});
```

**Shell injection via `exec()`:**

```js
// Wrong: userId = "123; rm -rf /" becomes a real shell command
const userId = req.query.id;
exec(`grep ${userId} /etc/passwd`, callback);

// Right: execFile skips the shell entirely
const { execFile } = require('child_process');
execFile('grep', [userId, '/etc/passwd'], callback);
```

`execFile()` passes arguments as an array directly to the OS, so shell metacharacters are never interpreted. The target program can still misread an argument that starts with `-` as an option, so validate user input as well.

**Assuming `fork()` shares memory with the parent:**

```js
// Wrong assumption: modifying data in the child does not update the parent
const sharedData = { count: 0 };
child.send(sharedData);
// child modifies count, parent sees nothing

// Right: return updated state through message passing
child.on('message', (updatedData) => {
  console.log('Parent received:', updatedData);
});
```

By default, objects are serialized as JSON when passed through IPC. The child gets a copy, not a reference.

**Leaving orphan processes after the parent exits:**

```js
// Wrong: child keeps running after parent crashes
const child = spawn('long-running-process');

// Right: clean up on exit (signals such as SIGINT need their own handlers)
process.on('exit', () => child.kill());

// If you want the child to outlive the parent intentionally:
const daemon = spawn('process', [], { detached: true, stdio: 'ignore' });
daemon.unref(); // parent will not wait for it
```

I ran into the `maxBuffer` limit with a CLI tool that read binary file metadata via `exec()`. It worked fine in development, then crashed in production whenever files exceeded 1MB. Switching to `spawn()` with piped output took about ten minutes and fixed it for good.
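
The switch from `exec()` to `spawn()` described above can be sketched roughly like this. `countOutputBytes` is a hypothetical helper, and counting bytes is only a stand-in for whatever per-chunk processing a real tool would do.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: streams a child's stdout and counts bytes
// instead of holding the whole output in memory.
function countOutputBytes(command, args) {
  return new Promise((resolve, reject) => {
    // stdin is not needed, stdout is piped to us, stderr goes to the parent's stderr
    const child = spawn(command, args, { stdio: ['ignore', 'pipe', 'inherit'] });
    let bytes = 0;

    child.stdout.on('data', (chunk) => {
      bytes += chunk.length; // chunk is a Buffer and can be discarded after processing
    });

    // Emitted when the process could not be started (for example ENOENT)
    child.on('error', reject);

    // 'close' fires after the process has ended and its stdio streams are closed
    child.on('close', (code, signal) => {
      if (code === 0) resolve(bytes);
      else reject(new Error(`Exited with code ${code}, signal ${signal}`));
    });
  });
}
```

Because each chunk is handled and dropped as it arrives, memory use stays roughly constant no matter how much the child prints, and there is no `maxBuffer` to exceed.
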

### Real-world usage

- Node.js Cluster module: uses `fork()` to spawn worker processes (commonly one per CPU core) for HTTP load balancing
- Jest, and Mocha in parallel mode: distribute test files across worker processes so memory leaks in one file are less likely to affect others
- Webpack and Vite toolchains: some compilation steps run in child processes so the main process and file watcher stay responsive
- npm and yarn: use `spawn()` internally when you run `npm run build` to execute the build script
- Worker pool libraries such as `workerpool` and `jest-worker`: can maintain a pool of reusable child processes for CPU-intensive tasks (Piscina applies the same idea to worker threads rather than `fork()`)

### Follow-up questions

**Q:** What is the difference between `spawn()` and `fork()`?

**A:** `spawn()` launches any OS process (shell command, binary, Python script) with streaming I/O. `fork()` specifically launches a Node.js file and adds an IPC channel for bidirectional messaging via `send()` and `on('message')`. `send()` is not available on a process started by `spawn()` unless you explicitly add an `'ipc'` entry to its `stdio` option.

**Q:** Why does `exec()` have a `maxBuffer` limit and how do you work around it?

**A:** `exec()` collects all stdout and stderr in memory before calling the callback. The default cap is 1MB. Pass `{ maxBuffer: N }` to increase it, or switch to `spawn()` for anything that might produce more than a few hundred kilobytes.

**Q:** How do you add a timeout to a child process?

**A:** Pass `{ timeout: 5000 }` to `spawn()` or `exec()` to kill the process after 5 seconds (with `SIGTERM` unless you set `killSignal`). Or manually: `setTimeout(() => child.kill(), 5000)`. Either way, listen to the `exit` event to confirm the process actually stopped.

**Q:** Can a child process outlive its parent?

**A:** Yes. Spawn with `{ detached: true, stdio: 'ignore' }` and call `child.unref()`. On Unix-like systems the child becomes its own process group leader and keeps running after the parent exits. Detaching the stdio matters: a child still connected to the parent's pipes will not stay in the background. This is how you create background daemons from a Node.js script.

**Q:** What is the performance difference between `fork()` and worker threads?

**A:** `fork()` creates a separate OS process with its own V8 instance, around 30MB overhead per process.
Worker threads run inside the same process, each with its own V8 isolate, at closer to 2MB per thread. For true parallelism across CPU cores both work. For shared memory via `SharedArrayBuffer`, only worker threads apply.

**Q (Senior):** How would you implement a worker pool with `fork()` and what edge cases would you handle?

**A:** Create an array of forked processes, maintain a task queue, and assign work round-robin or by availability. The real complexity is in edge cases: a child crashing mid-task (restart it and requeue), task timeouts (kill the child and retry), memory leaks in long-running children (restart after N tasks), and matching responses to requests (add correlation IDs to requests so responses reach the right caller). Pool libraries handle most of this: `workerpool` supports child processes, and Piscina does the equivalent for worker threads. Rolling your own is a good learning exercise but not something to put in production without thorough testing.

## Examples

### Basic: shell command with exec()

```js
const { exec } = require('child_process');
const { promisify } = require('util');

const execAsync = promisify(exec);

async function getInstalledPackages() {
  try {
    const { stdout } = await execAsync('npm list --depth=0');
    return stdout;
  } catch (err) {
    console.error('npm list failed:', err.message);
    return null;
  }
}
```

`promisify(exec)` wraps the callback API into a Promise. The entire output arrives at once in `stdout` because `exec()` buffers it. For `npm list` that is fine since the output is small.
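
The same promisify pattern works with `execFile()`, and it combines with the `timeout` option from the follow-up questions. A minimal sketch, assuming a hypothetical `runWithTimeout` helper:

```javascript
const { execFile } = require('child_process');
const { promisify } = require('util');

const execFileAsync = promisify(execFile);

// Hypothetical helper: runs a binary with a time limit and reports how it ended.
async function runWithTimeout(file, args, timeoutMs) {
  try {
    const { stdout } = await execFileAsync(file, args, { timeout: timeoutMs });
    return { ok: true, stdout };
  } catch (err) {
    // err.killed is true when Node.js terminated the child itself,
    // which is what happens when the timeout fires.
    return { ok: false, killed: err.killed === true, signal: err.signal };
  }
}

// Usage: a child that would idle for 10 seconds is stopped after 300ms.
runWithTimeout(process.execPath, ['-e', 'setTimeout(() => {}, 10000)'], 300)
  .then((result) => console.log(result));
```

Returning a result object instead of rethrowing is a design choice for this sketch; callers that prefer exceptions can let the rejection propagate.
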

### Intermediate: streaming a large log file with spawn()

```js
const { spawn } = require('child_process');
const fs = require('fs');

// Filter error lines from a large log without loading the file into memory
const grep = spawn('grep', ['ERROR', '/var/log/app.log']);
const output = fs.createWriteStream('errors.txt');

grep.stdout.pipe(output);

grep.on('error', (err) => console.error('grep failed to start:', err));
grep.on('close', (code) => {
  if (code === 0) {
    console.log('Done, errors.txt written');
  } else if (code === 1) {
    console.log('No matching lines found'); // grep exits with 1 when nothing matches
  } else {
    console.error(`grep exited with code ${code}`);
  }
});

// Runs immediately, event loop is not blocked
console.log('Filtering started...');
```

`pipe()` connects the child's stdout stream directly to a writable file stream, so only small chunks are held in memory at any time. Gigabytes of logs can pass through while the event loop stays free.

### Advanced: fork() with bidirectional messaging and error handling

```js
// worker.js - runs in its own process
process.on('message', (msg) => {
  if (msg.cmd === 'sum') {
    try {
      const result = msg.data.reduce((a, b) => a + b, 0);
      process.send({ id: msg.id, result });
    } catch (err) {
      process.send({ id: msg.id, error: err.message });
    }
  }
});
```

```js
// parent.js
const { fork } = require('child_process');
const path = require('path');

const child = fork(path.join(__dirname, 'worker.js'));

let messageId = 0;
const pending = new Map();

function calculate(data) {
  return new Promise((resolve, reject) => {
    const id = ++messageId;
    pending.set(id, { resolve, reject });
    child.send({ id, cmd: 'sum', data });
  });
}

child.on('message', (msg) => {
  const handler = pending.get(msg.id);
  if (!handler) return;
  pending.delete(msg.id);
  msg.error
    ? handler.reject(new Error(msg.error))
    : handler.resolve(msg.result);
});

child.on('error', (err) => console.error('Worker failed to start:', err));
process.on('exit', () => child.kill());

calculate([1, 2, 3, 4, 5]).then((result) => {
  console.log('Sum:', result); // Sum: 15
  child.kill();
});
```

The `id` field on each message is the correlation key. Without it, the parent could not tell which response belongs to which of two concurrent `calculate()` calls. This pattern is the foundation of any production worker pool.
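
One gap in the correlation-ID pattern is a worker that dies mid-task: its pending promises would never settle. The sketch below shows one possible way to handle that, not a production pool. The `createWorkerClient` helper, the `crash` command, and the temp-file worker are all assumptions made so the example is self-contained.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Assumption for this sketch: the worker source is written to a temp file.
// In a real project it would be an ordinary worker.js next to the parent.
const workerSource = `
process.on('message', (msg) => {
  if (msg.cmd === 'sum') {
    process.send({ id: msg.id, result: msg.data.reduce((a, b) => a + b, 0) });
  } else if (msg.cmd === 'crash') {
    process.exit(1); // simulate a worker dying mid-task
  }
});
`;
const workerDir = fs.mkdtempSync(path.join(os.tmpdir(), 'worker-'));
const workerPath = path.join(workerDir, 'worker.js');
fs.writeFileSync(workerPath, workerSource);

// Hypothetical helper: wraps one forked worker and rejects
// every in-flight request if the worker fails or exits.
function createWorkerClient(file) {
  const child = fork(file);
  const pending = new Map();
  let nextId = 0;

  const failAll = (err) => {
    for (const { reject } of pending.values()) reject(err);
    pending.clear();
  };

  child.on('message', (msg) => {
    const handler = pending.get(msg.id);
    if (!handler) return;
    pending.delete(msg.id);
    if (msg.error) handler.reject(new Error(msg.error));
    else handler.resolve(msg.result);
  });

  child.on('error', failAll);
  child.on('exit', (code, signal) => {
    failAll(new Error(`Worker exited (code ${code}, signal ${signal})`));
  });

  return {
    request(cmd, data) {
      return new Promise((resolve, reject) => {
        const id = ++nextId;
        pending.set(id, { resolve, reject });
        child.send({ id, cmd, data });
      });
    },
    close() {
      child.kill();
    },
  };
}
```

A caller would create a client with `createWorkerClient(workerPath)`, await `client.request('sum', [1, 2, 3])`, and call `client.close()` when done. A pool could build on this by restarting the worker and requeueing the rejected tasks.
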