# What is a buffer in Node.js?

## Short answer

**Buffer** in Node.js is a fixed-size array of raw bytes stored outside the V8 heap, used for handling binary data from files, streams, and network operations.

```js
const buf = Buffer.from('Hello', 'utf8');
console.log(buf); // <Buffer 48 65 6c 6c 6f>
console.log(buf.toString()); // "Hello"
```

**Key point:** Buffer keeps its bytes outside V8's managed heap, making it the right choice for binary I/O rather than plain text.

## Answer

**Buffer** is Node.js's fixed-size array of raw bytes, allocated outside the V8 heap, for handling binary data like files, streams, and network packets.

## Theory

### TL;DR

- Buffer is like a fixed-length tray of numbered slots: each slot holds one byte (0-255), accessed directly by index
- Main difference: Buffer stores raw bytes; JavaScript strings are sequences of UTF-16 code units (2 bytes per code unit)
- `Buffer.alloc(n)` gives you n zeroed bytes; `Buffer.allocUnsafe(n)` skips zeroing for speed but risks exposing old memory content
- Use Buffer for non-text data (images, crypto keys, TCP packets); use strings for readable text
- Slicing a Buffer (`buf.subarray()`, or the deprecated `buf.slice()`) does NOT copy memory: it returns a view into the same bytes

### Quick example

```js
// Create, write bytes, read as string
const buf = Buffer.alloc(5); // 5 zeroed slots
buf[0] = 0x48; // 'H'
buf[1] = 0x65; // 'e'
buf[2] = 0x6c; // 'l'
buf[3] = 0x6c; // 'l'
buf[4] = 0x6f; // 'o'

console.log(buf.toString('utf8')); // "Hello"
console.log(buf); // <Buffer 48 65 6c 6c 6f>

// subarray returns a view, not a copy (buf.slice is its deprecated alias)
const view = buf.subarray(0, 3);
console.log(view.toString()); // "Hel"
```

`buf[0] = 0x48` writes the byte for 'H' directly into memory. `toString('utf8')` decodes those bytes back to text.
The view shares backing memory with `buf`, so no bytes are copied.

### Key difference from strings

JavaScript strings are sequences of UTF-16 code units and live inside V8's heap under garbage collection. A Buffer stores unsigned 8-bit integers (0-255) in a backing store that Node.js allocates outside V8's managed heap. The garbage collector never scans or moves those bytes, and there is no encoding step between you and the data. When you read a JPEG from disk, you want bytes, not characters. That is exactly where Buffer fits.

### When to use

- File I/O with binary formats (images, PDFs, zip files): Buffer
- File I/O with text (JSON, CSV, logs): string
- TCP/UDP sockets, raw network frames: Buffer
- HTTP response bodies that are already text: string
- Crypto operations (keys, hashes, ciphertext): always Buffer, because strings can silently corrupt bytes through encoding
- Streaming large files: process as Buffer chunks, decode to string only when displaying
- `Buffer.allocUnsafe` is fine when you immediately overwrite every byte; otherwise use `Buffer.alloc`

### How it works internally

Node.js allocates Buffer memory through its own C++ layer, outside V8's managed heap. libuv performs the I/O that fills Buffers, but it does not allocate or own them. The garbage collector does not scan or relocate those bytes, which keeps GC work low when you hold large amounts of video or audio data. The memory is released once the Buffer object itself is collected. Small buffers created with `Buffer.allocUnsafe()` are carved out of a shared pre-allocated pool (`Buffer.poolSize`, 8 KiB by default).

Many Buffer methods, such as `buf.write()`, are backed by C++ bindings inside Node.js. Since Node.js 4, `Buffer` is a subclass of `Uint8Array`, so TypedArray methods work on it directly.

Slicing is zero-copy: `buf.subarray(0, 5)` (or the deprecated `buf.slice(0, 5)`) returns a new Buffer object that points at an offset in the same memory block, not a new allocation of bytes. Fast, but mutations to the slice affect the original buffer too.

### Common mistakes

**Mistake 1: using `Buffer.allocUnsafe` without overwriting all bytes**

```js
const buf = Buffer.allocUnsafe(10);
buf.write('hi');
// Bytes at positions 2-9 may contain old process memory
console.log(buf); // <Buffer 68 69 XX XX XX XX XX XX XX XX>
```

`allocUnsafe` skips zeroing for performance.
If you send this buffer over a network or write it to disk, those unwritten bytes expose whatever was in that memory before. In practice, this is an easy source of subtle data corruption and information leaks in binary pipelines. I have watched teams spend an afternoon debugging garbled image files before tracing it back to `allocUnsafe` with unwritten tail bytes.

Fix: use `Buffer.alloc(10)` or call `buf.fill(0)` right after `allocUnsafe`.

**Mistake 2: assuming `slice` copies data**

```js
const big = Buffer.alloc(1_000_000);
const small = big.slice(0, 10);
small[0] = 0xff;
console.log(big[0]); // 255 - you mutated big too
```

Unlike `Array.prototype.slice` and `TypedArray.prototype.slice`, `buf.slice()` shares backing memory with the original. It is deprecated in favor of `buf.subarray()`, which behaves the same way. If you need an independent copy, use `Buffer.from(small)`.

**Mistake 3: slicing mid-emoji with Unicode**

```js
const buf = Buffer.from('😊👍', 'utf8');
console.log(buf.length); // 8 bytes (4 per emoji in UTF-8)

const broken = buf.subarray(0, 3); // cuts through the first emoji
console.log(broken.toString('utf8')); // corrupted output (replacement characters)

const correct = buf.toString('utf8', 0, 4); // full first emoji
console.log(correct); // "😊"
```

Buffer operates at byte level, not character level. Multi-byte Unicode characters get corrupted when your offset does not align to a character boundary.

**Mistake 4: forgetting the `'hex'` encoding**

```js
// Wrong: treats the hex string as UTF-8 text
Buffer.from('48656c6c6f');

// Correct
Buffer.from('48656c6c6f', 'hex'); // <Buffer 48 65 6c 6c 6f> = "Hello"
```

Default encoding is `'utf8'`. Always pass the second argument when working with hex or base64 input.

**Mistake 5: treating Buffer as immutable like strings**

```js
const buf = Buffer.from('test');
buf[0] = 0x48; // changes 't' to 'H'
console.log(buf.toString()); // "Hest"
```

Buffers are mutable byte arrays. If you need a snapshot, create a copy with `Buffer.from(buf)`.
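
The view-versus-copy distinction behind Mistakes 2 and 5 can be checked directly. The following is a minimal sketch rather than part of the original answer: the variable names are illustrative, and it uses `buf.subarray()`, the non-deprecated equivalent of `buf.slice()`.

```js
// Sketch: a view shares memory with its source, a copy does not.
const original = Buffer.from('test');

const view = original.subarray(0, 2); // same backing memory as original
const copy = Buffer.from(original);   // independent snapshot of the bytes

original[0] = 0x48; // overwrite 't' with 'H'

console.log(view.toString());     // "He"   (the view sees the change)
console.log(copy.toString());     // "test" (the copy does not)
console.log(original.toString()); // "Hest"
```

Taking the copy before mutating is what makes it a snapshot; a copy taken afterwards would simply capture the changed bytes.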

### Real-world usage

- Express + Multer: with memory storage, uploaded files arrive as `req.file.buffer`; hash or virus-scan them before writing to disk
- Sharp image library: `sharp(buffer).resize().toBuffer()` processes JPEG and PNG entirely in memory
- Node crypto module: `crypto.createCipheriv(algorithm, keyBuffer, ivBuffer)` for AES encryption
- `fs.createReadStream` emits Buffer chunks; you control when to decode to string
- `socket.write(buffer)` on a `net.Socket` for raw TCP frames, with no string encoding step

### Follow-up questions

**Q:** What is the difference between `Buffer.alloc` and `Buffer.allocUnsafe`?
**A:** `alloc` fills memory with zeros before returning it, which is safe but slightly slower. `allocUnsafe` skips zeroing for speed, so the bytes contain whatever was in that memory region before. Use `allocUnsafe` only when you are about to overwrite every position immediately.

**Q:** Why are Buffers allocated outside the V8 heap?
**A:** To keep large binary data out of the garbage collector's way. If 1 GB of video stream data lived inside V8's managed heap, the collector would have to track and potentially move it, and it would count against the heap size limit. With an off-heap backing store, the GC only manages the small Buffer object and frees the bytes when that object is collected.

**Q:** `Buffer` vs `Uint8Array` in Node.js 20. Which one do you use?
**A:** `Buffer` is a subclass of `Uint8Array`, so they interoperate. For Node.js I/O APIs (streams, crypto, http), Buffer is the natural choice because those APIs return Buffers. For new utility code that does not touch I/O, `Uint8Array` is portable to browser environments. Both keep their bytes off-heap, so GC behavior is not a reason to pick one over the other. Watch for one behavioral difference: `Uint8Array.prototype.slice` copies, while `Buffer.prototype.slice` returns a view.

**Q:** How does `Buffer.concat` work, and is there a cheaper option?
**A:** `Buffer.concat([buf1, buf2])` allocates a new Buffer and copies all data into it, which is O(n) in total size. There is no zero-copy concat. If you only need to read across multiple buffers, keep them in an array and track offsets manually.

**Q:** What happens if you write a large Buffer to a writable stream without handling backpressure?
**A:** The writable stream's internal buffer fills up. If the consumer is slow, data accumulates in memory until the process runs out of memory. Fix: use `stream.pipeline()`, or check the return value of `write()`, pause the source when it returns `false`, and resume on the `'drain'` event.

## Examples

### Basic: string to Buffer and back

```js
const str = 'Hello';
const buf = Buffer.from(str, 'utf8');

console.log(buf); // <Buffer 48 65 6c 6c 6f>
console.log(buf.length); // 5
console.log(buf[0]); // 72 (decimal for 0x48)

console.log(buf.toString('utf8')); // "Hello"
console.log(buf.toString('hex')); // "48656c6c6f"
console.log(buf.toString('base64')); // "SGVsbG8="
```

`Buffer.from(str, 'utf8')` encodes each character as its UTF-8 byte sequence. `toString` decodes it back. For pure ASCII like "Hello", byte count equals character count. Add a non-ASCII character and byte count grows: `'€'` alone takes 3 bytes.

### Intermediate: file upload handler with hash check

```js
const crypto = require('crypto');
const fs = require('fs');
const express = require('express');
const multer = require('multer');

const app = express();
// Memory storage makes Multer expose the upload as req.file.buffer
const upload = multer({ storage: multer.memoryStorage() });

// 'file' is the form field name this sketch expects from the client
app.post('/upload', upload.single('file'), (req, res) => {
  if (!req.file) {
    return res.status(400).send('No file uploaded');
  }

  const fileBuffer = req.file.buffer;

  if (fileBuffer.length > 1_000_000) {
    return res.status(413).send('File too large');
  }

  const hash = crypto
    .createHash('sha256')
    .update(fileBuffer)
    .digest('hex');

  // Assumes the uploads/ directory already exists
  fs.writeFileSync(`uploads/${hash}.jpg`, fileBuffer);
  res.send({ saved: hash });
});
```

The file bytes stay as a Buffer from HTTP receipt to disk write. No intermediate string conversion, no encoding issues. The hash is computed directly on the binary data, which is what you need for reliable integrity checks.
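
To illustrate why the bytes should not pass through a text encoding on their way to the hash, here is a small standalone sketch (not part of the handler above). The four sample bytes are the usual start of a JPEG (JFIF) file, and `sha256` is a hypothetical helper introduced only for this example.

```js
const crypto = require('node:crypto');

// Hypothetical helper for this sketch: hex SHA-256 digest of a Buffer.
const sha256 = (buf) => crypto.createHash('sha256').update(buf).digest('hex');

// Usual first four bytes of a JPEG (JFIF) file; they are not valid UTF-8.
const original = Buffer.from([0xff, 0xd8, 0xff, 0xe0]);

// Lossy: invalid UTF-8 sequences are replaced while decoding to a string.
const viaUtf8 = Buffer.from(original.toString('utf8'), 'utf8');

// Lossless: hex can represent every byte value.
const viaHex = Buffer.from(original.toString('hex'), 'hex');

console.log(original.equals(viaUtf8)); // false
console.log(original.equals(viaHex)); // true
console.log(sha256(original) === sha256(viaUtf8)); // false
```

If binary data has to travel as text, `'hex'` or `'base64'` round-trips it intact, while `'utf8'` does not.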

### Advanced: multi-byte Unicode slicing edge case

```js
const emoji = '😊👍';
const buf = Buffer.from(emoji, 'utf8');

console.log(buf.length); // 8 bytes (each emoji is 4 bytes in UTF-8)
console.log(emoji.length); // 4 (JS counts UTF-16 code units: 2 per emoji)

// Wrong: byte offset 3 is inside the first emoji
const broken = buf.subarray(0, 3);
console.log(broken.toString('utf8')); // corrupted output (replacement characters)

// Right: align to the 4-byte character boundary
console.log(buf.toString('utf8', 0, 4)); // "😊"
console.log(buf.toString('utf8', 4, 8)); // "👍"
```

This catches developers who assume byte length equals character length. Neither `buf.length` (bytes) nor `str.length` (UTF-16 code units) counts visible characters. For multi-byte characters, always compute offsets using `Buffer.byteLength(str, 'utf8')` or work at the string level first, then convert to Buffer.
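
When chunk boundaries are outside your control, as with stream data, one option is Node's built-in `string_decoder` module, which holds back an incomplete multi-byte sequence until the remaining bytes arrive. A minimal sketch, with the split point chosen arbitrarily for illustration:

```js
const { StringDecoder } = require('node:string_decoder');

const buf = Buffer.from('😊👍', 'utf8');

// Simulate two chunks that split the first emoji after 3 of its 4 bytes.
const chunks = [buf.subarray(0, 3), buf.subarray(3)];

const decoder = new StringDecoder('utf8');
let text = '';
for (const chunk of chunks) {
  // Incomplete trailing characters are kept inside the decoder, not emitted.
  text += decoder.write(chunk);
}
text += decoder.end(); // flush anything still pending

console.log(text === '😊👍'); // true
```

Calling `chunk.toString('utf8')` on each chunk separately would instead produce replacement characters at the split.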