
What is a buffer in Node.js?

Buffer is Node.js's fixed-size array of raw bytes, allocated outside the V8 heap, for handling binary data like files, streams, and network packets.

Theory

TL;DR

  • Buffer is like a fixed-length tray of numbered slots: each slot holds one byte (0-255), accessed directly by index
  • Main difference: Buffer stores raw bytes; JavaScript strings are sequences of UTF-16 code units (2 bytes per code unit at the language level)
  • Buffer.alloc(n) gives you n zeroed bytes; Buffer.allocUnsafe(n) skips zeroing for speed but risks exposing old memory content
  • Use Buffer for non-text data (images, crypto keys, TCP packets); use strings for readable text
  • Slicing a Buffer does NOT copy memory: it returns a view into the same bytes

Quick example

```js
// Create, write bytes, read as string
const buf = Buffer.alloc(5); // 5 zeroed slots
buf[0] = 0x48; // 'H'
buf[1] = 0x65; // 'e'
buf[2] = 0x6c; // 'l'
buf[3] = 0x6c; // 'l'
buf[4] = 0x6f; // 'o'

console.log(buf.toString('utf8')); // "Hello"
console.log(buf); // <Buffer 48 65 6c 6c 6f>

// slice returns a view, not a copy
const view = buf.slice(0, 3);
console.log(view.toString()); // "Hel"
```

buf[0] = 0x48 writes the byte for 'H' directly into memory. toString('utf8') decodes those bytes back to text. The slice shares backing memory with buf, so no bytes are copied; only a small view object is created.

Key difference from strings

JavaScript strings are sequences of UTF-16 code units: at the language level every character takes at least 2 bytes, and string data lives inside V8's heap under garbage collection. A Buffer stores unsigned 8-bit integers (0-255) in an ArrayBuffer whose backing memory sits outside V8's managed heap. The collector tracks the Buffer object but never scans or relocates its bytes, and no text encoding is applied until you ask for one. When you read a JPEG from disk, you want bytes, not characters. That is exactly where Buffer fits.
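To make the code-unit versus byte distinction concrete, here is a minimal sketch that compares string length with encoded byte length. The sample strings are arbitrary illustrations; only `Buffer.byteLength` is a real Node.js API here.

```javascript
// String length counts UTF-16 code units; Buffer.byteLength counts encoded bytes.
// 'h\u00e9llo' spells "héllo" with a precomposed é so the result is unambiguous.
const samples = ['Hello', 'h\u00e9llo', '日本語', '😊'];

const rows = samples.map((text) => ({
  text,
  codeUnits: text.length,                          // UTF-16 code units
  utf8Bytes: Buffer.byteLength(text, 'utf8'),      // bytes when encoded as UTF-8
  utf16Bytes: Buffer.byteLength(text, 'utf16le'),  // bytes when encoded as UTF-16
}));

console.table(rows);
```

ASCII text is one byte per character in UTF-8, while the emoji is a single visible character that occupies two code units in a string and four bytes in a UTF-8 Buffer.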

When to use

  • File I/O with binary formats (images, PDFs, zip files): Buffer
  • File I/O with text (JSON, CSV, logs): string
  • TCP/UDP sockets, raw network frames: Buffer
  • HTTP response bodies that are already text: string
  • Crypto operations (keys, hashes, ciphertext): always Buffer; strings can silently corrupt bytes through encoding
  • Streaming large files: process as Buffer chunks, decode to string only when displaying
  • Buffer.allocUnsafe is fine when you immediately overwrite every byte; otherwise use Buffer.alloc
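The crypto bullet above is easy to verify. The sketch below round-trips a few arbitrary bytes (a stand-in for key material, chosen because they are not valid UTF-8) through different encodings; the variable names are illustrative only.

```javascript
// Arbitrary bytes that do not form valid UTF-8 (stand-in for key material)
const key = Buffer.from([0xff, 0xfe, 0x00, 0x80]);

// Lossy: invalid sequences are replaced with U+FFFD while decoding to a string
const viaUtf8 = Buffer.from(key.toString('utf8'), 'utf8');

// Lossless: hex and base64 are byte-preserving text encodings
const viaHex = Buffer.from(key.toString('hex'), 'hex');
const viaBase64 = Buffer.from(key.toString('base64'), 'base64');

console.log(key.equals(viaUtf8));   // false
console.log(key.equals(viaHex));    // true
console.log(key.equals(viaBase64)); // true
```

If binary data must travel as text (JSON, environment variables), hex or base64 keeps it intact; decoding it as UTF-8 does not.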

How it works internally

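Since Node.js 4, Buffer is a subclass of Uint8Array, so every Buffer is a view over an ArrayBuffer and TypedArray methods work on it directly. The ArrayBuffer's backing store is allocated outside V8's managed heap by Node.js's C++ layer. The garbage collector tracks the small Buffer object but does not scan or move the bytes themselves, which keeps gigabytes of video or audio data from inflating the managed heap; the backing store is released once the Buffer object is collected. Many Buffer methods, such as buf.write() and buf.toString(), are backed by C++ for speed. Small buffers created by Buffer.allocUnsafe() and Buffer.from() are carved out of a shared pre-allocated pool (Buffer.poolSize, 8 KiB by default), which is why buf.byteOffset is often non-zero.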
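The Uint8Array relationship can be inspected directly. This is a small sketch rather than a description of Node.js internals: the printed pool values depend on the Node.js version and on what else has been allocated, so no expected output is given for them.

```javascript
const buf = Buffer.from([3, 1, 2]);

// Every Buffer is a Uint8Array view over an ArrayBuffer
const isTypedArray = buf instanceof Uint8Array;
const hasArrayBuffer = buf.buffer instanceof ArrayBuffer;

// Inherited TypedArray methods operate on the same bytes
buf.sort(); // in-place numeric sort
const sum = buf.reduce((acc, byte) => acc + byte, 0);

console.log(isTypedArray, hasArrayBuffer, [...buf], sum);

// Small unsafe allocations may share one larger ArrayBuffer (the pool),
// so their byteOffset can be non-zero. Values vary from run to run.
const pooled = Buffer.allocUnsafe(16);
const standalone = Buffer.alloc(16);
console.log(Buffer.poolSize);
console.log(pooled.buffer.byteLength, pooled.byteOffset);
console.log(standalone.buffer.byteLength, standalone.byteOffset);
```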

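Slicing is zero-copy: buf.slice(0, 5) returns a new Buffer object that points at an offset into the same memory block, so no bytes are copied. It is fast, but mutations to the slice affect the original buffer too. buf.slice() is deprecated in current Node.js releases; buf.subarray() has the same view semantics and is the recommended spelling.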
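A quick way to see the shared memory is to compare the underlying ArrayBuffer and offsets of a view and its parent. The sketch uses buf.subarray(), which returns the same kind of view as buf.slice(); the sample string is arbitrary.

```javascript
const original = Buffer.from('Hello, world', 'utf8');

// A view over the first five bytes; no bytes are copied
const head = original.subarray(0, 5);
// A view starting at byte 7
const tail = original.subarray(7);

head[0] = 0x4a; // write 'J' through the view

const sameMemory = head.buffer === original.buffer;          // same ArrayBuffer
const tailOffset = tail.byteOffset - original.byteOffset;    // offset of the view

console.log(original.toString()); // "Jello, world"
console.log(sameMemory, tailOffset); // true 7
```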

Common mistakes

Mistake 1: using Buffer.allocUnsafe without overwriting all bytes

```js
const buf = Buffer.allocUnsafe(10);
buf.write('hi');

// Bytes at positions 2-9 may contain old process memory
console.log(buf); // <Buffer 68 69 XX XX XX XX XX XX XX XX>
```

allocUnsafe skips zeroing for performance. If you send this buffer over a network or write it to disk, those unwritten bytes expose whatever was in that memory before, which is both an information leak and a source of subtle data corruption in binary pipelines. I have watched teams spend an afternoon debugging garbled image files before tracing it back to allocUnsafe with unwritten tail bytes. Fix: use Buffer.alloc(10) or call buf.fill(0) right after allocUnsafe.
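The fixes can be sketched side by side. A third option, not mentioned above, is to keep allocUnsafe but expose only the bytes that were actually written, using the byte count that buf.write() returns. Variable names are illustrative.

```javascript
const message = 'hi';

// Option 1: zero-filled allocation, unwritten bytes are 0x00
const zeroed = Buffer.alloc(10);
zeroed.write(message, 'utf8');

// Option 2: allocUnsafe, then hand out only the written prefix
const scratch = Buffer.allocUnsafe(10);
const written = scratch.write(message, 'utf8'); // number of bytes written
const exact = scratch.subarray(0, written);

// Option 3: let Buffer.from size the buffer to the content
const sized = Buffer.from(message, 'utf8');

console.log(zeroed.toString('hex')); // "68690000000000000000"
console.log(exact, sized);           // <Buffer 68 69> <Buffer 68 69>
```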

Mistake 2: assuming slice copies data

```js
const big = Buffer.alloc(1_000_000);
const small = big.slice(0, 10);

small[0] = 0xff;
console.log(big[0]); // 255 - you mutated big too
```

Slice shares backing memory. If you need an independent copy, use Buffer.from(small).
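As a sketch of the difference, the example below takes one view and two independent copies of the same bytes, then mutates the view. Buffer.from(buffer) and buf.copy() are both standard ways to copy; the data is arbitrary.

```javascript
const big = Buffer.from([1, 2, 3, 4]);

const view = big.subarray(0, 2);   // shares memory with big
const copyA = Buffer.from(view);   // Buffer.from(buffer) copies the bytes

const copyB = Buffer.alloc(2);     // or copy into a buffer you allocated
view.copy(copyB);

view[0] = 0xff;

console.log(big[0]);   // 255, the view wrote through to big
console.log(copyA[0]); // 1, unaffected
console.log(copyB[0]); // 1, unaffected
```

Copying also lets the large parent buffer be released: a ten-byte view keeps the whole backing allocation alive for as long as the view is reachable.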

Mistake 3: slicing mid-emoji with Unicode

```js
const buf = Buffer.from('😊👍', 'utf8');
console.log(buf.length); // 8 bytes (4 per emoji in UTF-8)

const broken = buf.slice(0, 3); // cuts through the first emoji
console.log(broken.toString('utf8')); // corrupted output

const correct = buf.toString('utf8', 0, 4); // full first emoji
console.log(correct); // "😊"
```

Buffer operates at byte level, not character level. Multi-byte Unicode characters get corrupted when your offset does not align to a character boundary.
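When the chunk boundaries are outside your control, for example with stream chunks, Node's built-in string_decoder module can hold back an incomplete character until the rest arrives. The chunk sizes below are chosen deliberately to split both emoji; this is an illustrative sketch, not the only way to handle the problem.

```javascript
const { StringDecoder } = require('string_decoder');

const bytes = Buffer.from('😊👍', 'utf8');
// Split at byte offsets 3 and 6 so both emoji are cut in the middle
const chunks = [bytes.subarray(0, 3), bytes.subarray(3, 6), bytes.subarray(6)];

// Naive: decoding each chunk independently produces replacement characters
const naive = chunks.map((chunk) => chunk.toString('utf8')).join('');

// StringDecoder buffers incomplete multi-byte sequences between writes
const decoder = new StringDecoder('utf8');
const decoded = chunks.map((chunk) => decoder.write(chunk)).join('') + decoder.end();

console.log(naive === '😊👍');   // false
console.log(decoded === '😊👍'); // true
```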

Mistake 4: forgetting the 'hex' encoding

```js
// Wrong: treats the hex string as UTF-8 text
Buffer.from('48656c6c6f');

// Correct
Buffer.from('48656c6c6f', 'hex'); // <Buffer 48 65 6c 6c 6f> = "Hello"
```

Default encoding is 'utf8'. Always pass the second argument when working with hex or base64 input.
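The effect of the missing argument shows up in the byte counts. The sketch below also adds a strict parser, because the 'hex' decoder stops silently at the first invalid character instead of throwing. fromHexStrict is a hypothetical helper written for this example, not a Node.js API.

```javascript
const hex = '48656c6c6f';

const asText = Buffer.from(hex);                    // 10 bytes: the ASCII codes of the digits
const asHex = Buffer.from(hex, 'hex');              // 5 bytes: the decoded values
const asBase64 = Buffer.from('SGVsbG8=', 'base64'); // 5 bytes: "Hello"

// Invalid hex input is truncated rather than rejected
const truncated = Buffer.from('1ag123', 'hex');     // only 0x1a survives

// Hypothetical helper: reject anything that is not an even run of hex digits
function fromHexStrict(input) {
  if (!/^(?:[0-9a-f]{2})*$/i.test(input)) {
    throw new TypeError(`Not a valid hex string: ${input}`);
  }
  return Buffer.from(input, 'hex');
}

console.log(asText.length, asHex.length);   // 10 5
console.log(asHex.toString(), asBase64.toString()); // "Hello" "Hello"
console.log(truncated.length);              // 1
console.log(fromHexStrict(hex).toString()); // "Hello"
```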

Mistake 5: treating Buffer as immutable like strings

```js
const buf = Buffer.from('test');
buf[0] = 0x48; // changes 't' to 'H'
console.log(buf.toString()); // "Hest"
```

Buffers are mutable byte arrays. If you need a snapshot, create a copy with Buffer.from(buf).

Real-world usage

  • Express + Multer (memory storage): uploaded files arrive as req.file.buffer, so you can hash or virus-scan them before writing to disk
  • Sharp image library: sharp(buffer).resize().toBuffer() processes JPEG and PNG entirely in memory
  • Node crypto module: crypto.createCipheriv(algorithm, keyBuffer, ivBuffer) for AES encryption
  • fs.createReadStream emits Buffer chunks; you control when to decode to string
  • net.Socket.write(buffer) for raw TCP payloads, sent as-is with no string encoding step
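As an illustration of the crypto bullet, here is a minimal encrypt and decrypt round trip where the key, IV, plaintext, and ciphertext are all Buffers. The choice of AES-256-GCM, the random key, and the sample payload are assumptions made for this sketch; real code needs proper key management.

```javascript
const crypto = require('crypto');

// Assumed parameters for the sketch: AES-256-GCM with a random key and nonce
const key = crypto.randomBytes(32); // 256-bit key
const iv = crypto.randomBytes(12);  // 96-bit nonce, must be unique per message
const plaintext = Buffer.from('binary payload', 'utf8');

const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
const tag = cipher.getAuthTag();    // authentication tag, needed to decrypt

const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
decipher.setAuthTag(tag);
const decrypted = Buffer.concat([decipher.update(ciphertext), decipher.final()]);

console.log(decrypted.equals(plaintext)); // true
```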

Follow-up questions

Q: What is the difference between Buffer.alloc and Buffer.allocUnsafe?
A: alloc fills memory with zeros before returning it, safe but slightly slower. allocUnsafe skips zeroing for speed, but the bytes contain whatever was in that memory region before. Use allocUnsafe only when you are about to overwrite every position immediately.

Q: Why are Buffers allocated outside the V8 heap?
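A: To keep large binary payloads out of the garbage-collected heap. If 1 GB of video stream data lived inside V8's managed heap, the collector would have to manage it alongside ordinary objects. A Buffer's bytes sit in an off-heap ArrayBuffer backing store instead: V8 tracks only the small wrapper object and frees the backing store when that object is collected, so the bytes themselves add little GC work. The memory still counts toward the process footprint; it shows up under external and arrayBuffers in process.memoryUsage().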

Q: Buffer vs Uint8Array in Node.js 20. Which one do you use?
A: Buffer is a subclass of Uint8Array, so they interoperate. For Node.js I/O APIs (streams, crypto, http), Buffer is the natural choice because those APIs return Buffers. For new utility code that does not touch I/O, Uint8Array is portable to browser environments. Both keep their bytes in an ArrayBuffer backing store outside the managed heap, so neither has a GC advantage on large file I/O; what Buffer adds is encoding-aware helpers (toString('hex'), readUInt32BE, and so on) and pooled allocation for small sizes.
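Because both types are views over an ArrayBuffer, converting between them does not have to copy. The sketch below shows both directions plus an explicit copy for contrast; the values are arbitrary.

```javascript
// Uint8Array -> Buffer without copying: wrap the same ArrayBuffer region
const u8 = new Uint8Array([1, 2, 3, 4]);
const asBuffer = Buffer.from(u8.buffer, u8.byteOffset, u8.byteLength);

// Buffer.from(typedArray) copies instead
const copied = Buffer.from(u8);

asBuffer[0] = 9; // visible through u8, not through the copy

// Buffer -> Uint8Array without copying: pass offset and length explicitly,
// because a small Buffer may be a slice of a larger shared pool
const buf = Buffer.from('abc', 'utf8');
const asU8 = new Uint8Array(buf.buffer, buf.byteOffset, buf.length);

console.log(u8[0], copied[0]); // 9 1
console.log(asU8.length, asU8[0]); // 3 97
```

Passing buf.buffer on its own, without byteOffset and length, can expose unrelated bytes from the pool, which is why the three-argument form is used here.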

Q: How does Buffer.concat work, and is there a cheaper option?
A: Buffer.concat([buf1, buf2]) allocates a new Buffer and copies all data into it, O(n) in total size. There is no zero-copy concat. If you only need to read across multiple buffers, keep them in an array and track offsets manually.
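Both approaches can be sketched in a few lines. byteAt is a hypothetical helper written for this example to show the "track offsets manually" idea; it is not part of Node.js.

```javascript
const chunks = [Buffer.from('He'), Buffer.from('llo'), Buffer.from('!')];

// Copying approach: passing totalLength saves concat from computing it again
const totalLength = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
const joined = Buffer.concat(chunks, totalLength);

// Non-copying approach (hypothetical helper): read one byte at a logical
// index across the chunk list without building a combined buffer
function byteAt(list, index) {
  let offset = index;
  for (const chunk of list) {
    if (offset < chunk.length) return chunk[offset];
    offset -= chunk.length;
  }
  return undefined; // index is past the end of the data
}

console.log(joined.toString());  // "Hello!"
console.log(byteAt(chunks, 4));  // 111, the byte for 'o'
```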

Q: What happens if you write a large Buffer to a writable stream without handling backpressure?
A: The writable stream's internal buffer fills up. If the consumer is slow, data accumulates in memory until the process runs out of memory. Fix: use stream.pipeline(), or check the return value of write() and pause the source when it returns false, resuming on the 'drain' event.
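The manual variant of that fix might look like the sketch below. makeSlowSink and writeAll are hypothetical helpers, and the 4-byte highWaterMark is deliberately tiny so that backpressure triggers immediately; in most real code stream.pipeline() is the simpler choice.

```javascript
const { Writable } = require('stream');
const { finished } = require('stream/promises');
const { once } = require('events');

// Hypothetical slow consumer with a very small internal buffer
function makeSlowSink() {
  const received = [];
  const sink = new Writable({
    highWaterMark: 4,
    write(chunk, _encoding, callback) {
      received.push(chunk);
      setImmediate(callback); // simulate slow, asynchronous I/O
    },
  });
  return { sink, received };
}

// Hypothetical helper: write chunks one by one, waiting for 'drain'
// whenever write() reports that the internal buffer is full
async function writeAll(writable, chunks) {
  for (const chunk of chunks) {
    if (!writable.write(chunk)) {
      await once(writable, 'drain');
    }
  }
  writable.end();
  await finished(writable);
}

// write() returns false as soon as buffered bytes reach highWaterMark
const probe = makeSlowSink();
const accepted = probe.sink.write(Buffer.alloc(8));
console.log(accepted); // false

const { sink, received } = makeSlowSink();
writeAll(sink, [Buffer.alloc(8), Buffer.alloc(8), Buffer.alloc(8)])
  .then(() => console.log(`wrote ${received.length} chunks`));
```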

Examples

Basic: string to Buffer and back

```js
const str = 'Hello';
const buf = Buffer.from(str, 'utf8');

console.log(buf); // <Buffer 48 65 6c 6c 6f>
console.log(buf.length); // 5
console.log(buf[0]); // 72 (decimal for 0x48)

console.log(buf.toString('utf8')); // "Hello"
console.log(buf.toString('hex')); // "48656c6c6f"
console.log(buf.toString('base64')); // "SGVsbG8="
```

Buffer.from(str, 'utf8') encodes each character as its UTF-8 byte value. toString decodes it back. For pure ASCII like "Hello", byte count equals character count. Add a non-ASCII character and byte count grows.

Intermediate: file upload handler with hash check

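```js
const crypto = require('crypto');
const fs = require('fs');
const express = require('express');
const multer = require('multer');

const app = express();
// Memory storage makes Multer expose the upload as req.file.buffer
const upload = multer({ storage: multer.memoryStorage() });

// 'file' is the assumed name of the multipart form field
app.post('/upload', upload.single('file'), (req, res) => {
  if (!req.file) {
    return res.status(400).send('No file uploaded');
  }
  const fileBuffer = req.file.buffer;

  if (fileBuffer.length > 1_000_000) {
    return res.status(413).send('File too large');
  }

  // Hash the raw bytes; no string conversion involved
  const hash = crypto
    .createHash('sha256')
    .update(fileBuffer)
    .digest('hex');

  fs.writeFileSync(`uploads/${hash}.jpg`, fileBuffer);
  res.send({ saved: hash });
});
```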

The file bytes stay as a Buffer from HTTP receipt to disk write. No intermediate string conversion, no encoding issues. The hash is computed directly on the binary data, which is what you need for reliable integrity checks.

Advanced: multi-byte Unicode slicing edge case

```js
const emoji = '😊👍';
const buf = Buffer.from(emoji, 'utf8');

console.log(buf.length); // 8 bytes (each emoji is 4 bytes in UTF-8)
console.log(emoji.length); // 4 (each emoji is a surrogate pair: 2 UTF-16 code units)

// Wrong: byte offset 3 is inside the first emoji
const broken = buf.slice(0, 3);
console.log(broken.toString('utf8')); // corrupted output

// Right: align to the 4-byte character boundary
console.log(buf.toString('utf8', 0, 4)); // "😊"
console.log(buf.toString('utf8', 4, 8)); // "👍"
```

This catches developers who assume byte length equals character length. For multi-byte characters, always compute offsets using Buffer.byteLength(str, 'utf8') or work at the string level first, then convert to Buffer.
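One way to apply the "work at the string level first" advice is to build a prefix that fits a byte budget by walking code points and summing their UTF-8 sizes. utf8Prefix is a hypothetical helper written for this sketch. It respects code point boundaries only, so a multi-code-point sequence such as a flag or a skin-tone emoji could still be split.

```javascript
// Hypothetical helper: longest prefix of `text` whose UTF-8 encoding
// fits in `maxBytes` without cutting through a code point
function utf8Prefix(text, maxBytes) {
  let used = 0;
  let prefix = '';
  for (const char of text) { // string iteration yields whole code points
    const size = Buffer.byteLength(char, 'utf8');
    if (used + size > maxBytes) break;
    used += size;
    prefix += char;
  }
  return prefix;
}

console.log(utf8Prefix('😊👍', 3)); // ""
console.log(utf8Prefix('😊👍', 7)); // "😊"
console.log(utf8Prefix('😊👍', 8)); // "😊👍"
```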
