What is WebSocket and how does it work?

WebSocket is a protocol that upgrades an HTTP connection to a persistent, full-duplex channel over TCP, where both client and server can send messages at any time without repeated requests.

Theory

TL;DR

  • HTTP is like mailing a letter: you send, server replies, connection closes. WebSocket is like a phone call: line stays open, both sides talk freely.
  • Main difference: one persistent TCP connection vs repeated request-response cycles, each carrying full HTTP headers and, when the connection is not reused, a new TCP handshake.
  • The handshake starts as HTTP, upgrades via a 101 Switching Protocols response, then stays open until someone closes it.
  • Use WebSocket when you need sub-100ms latency or server-to-client push without waiting for a client request. Otherwise SSE or HTTP/2 streams are enough.
  • ws:// for local dev, always wss:// in production. Browsers block ws:// on HTTPS pages.

Quick example

javascript
const ws = new WebSocket('wss://echo.websocket.org');

ws.onopen = () => {
  console.log('Connected'); // fires once, line is open
  ws.send('Ping');          // send anytime - no request cycle needed
};
ws.onmessage = (e) => console.log('Echo:', e.data); // Output: Echo: Ping
ws.onerror = (e) => console.error('Error:', e);
ws.onclose = () => console.log('Disconnected');

Four handlers cover the full socket lifecycle. The connection stays open after onopen fires - no polling, no repeated requests.

How the handshake works

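The client sends a regular HTTP/1.1 GET with a few extra headers. Upgrade, Connection, Sec-WebSocket-Key and Sec-WebSocket-Version are all required (the Host value here is a placeholder):

GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13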

The server responds with 101 Switching Protocols and a Sec-WebSocket-Accept header. That header is a SHA1 hash of the client's key combined with a fixed GUID (258EAFA5-E914-47DA-95CA-C5AB0DC85B11), encoded in base64. The client verifies the hash, and from that point the TCP connection speaks WebSocket framing. HTTP is done.
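The Sec-WebSocket-Accept calculation is short enough to check by hand. A minimal sketch using Node's built-in crypto module; computeAccept is a hypothetical helper name, not part of any library:

```javascript
const crypto = require('node:crypto');

// Fixed GUID defined by the WebSocket protocol (RFC 6455).
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Hypothetical helper: derive Sec-WebSocket-Accept from Sec-WebSocket-Key.
// The key is concatenated with the GUID as plain text, hashed with SHA-1,
// and the raw digest is base64-encoded.
function computeAccept(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WS_GUID)
    .digest('base64');
}

console.log(computeAccept('dGhlIHNhbXBsZSBub25jZQ=='));
// prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The key and the expected value are the sample pair from RFC 6455. The hash is not a security feature: it only proves the server understood the WebSocket handshake rather than echoing a cached HTTP response.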

Data travels in frames after the upgrade. Each frame has an opcode (text, binary, ping, pong, close), a mask bit, and a payload. Browser clients always mask their frames to prevent cache poisoning by proxies. Servers send unmasked. A FIN bit marks the last frame of a message.

Key difference from HTTP

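HTTP is stateless by design: every exchange is a request followed by a response, and the server cannot speak first. Connections can be reused with keep-alive, but each request still carries full headers. For real-time apps this means polling: the client asks every second "any news?" - most replies are empty, and each poll costs a network round trip plus header overhead (often on the order of 150-200ms on slow links). Long polling holds the request open until data arrives, but the client must re-issue the request after every message, so the server still cannot push freely.

WebSocket removes that constraint. After the upgrade, the server pushes data the moment something happens. Delivery delay drops from the polling interval plus a round trip (200ms+ in the example above) to roughly one-way network latency, which can be under 10ms to a nearby server. Both sides send frames independently.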

When to use

  • Live chat (Slack, Discord): typing indicators, instant delivery - WebSocket.
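  • Multiplayer games: frequent two-way position sync fits WebSocket well. HTTP polling is too slow for this; WebRTC data channels or WebTransport are the usual alternatives when unreliable (UDP-style) delivery is needed.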
  • Trading dashboards: price updates every 100ms, clients also send orders - WebSocket over SSE.
  • Collaborative editing (Figma, Google Docs): cursor positions, concurrent ops - WebSocket.
  • One-way notifications: SSE is simpler (server-to-client only, built-in auto-reconnect).
  • Updates once per minute: plain HTTP polling. No reason to hold a persistent connection.

Comparison: WebSocket vs alternatives

| Feature | HTTP Polling | Long Polling | SSE | WebSocket |
| --- | --- | --- | --- | --- |
| Direction | Client-initiated only | Client-initiated only | Server to client only | Full-duplex |
| Connection | New per request | Held until data arrives | Persistent, auto-reconnect | Persistent, manual close |
| Overhead | High | Medium | Low | Lowest |
| Browser support | Full | Full | All modern browsers | 95%+ globally |
| Best for | Static pages | Simple real-time | Live feeds, notifications | Chat, games, collaboration |
| When to use | Rare updates | No WebSocket option | Server-push only | Bidirectional, low latency |

SSE handles the server-push case well and reconnects automatically. WebSocket is the right call when the client also sends data frequently.

How frames work internally

The ws package for Node.js and the browser's native WebSocket API both sit on top of the OS TCP stack. A frame starts with a 2-byte header: 1 bit FIN, 3 reserved bits, 4 bits opcode, 1 bit MASK, 7 bits payload length (extended for larger payloads). Browser clients mask payloads with a 4-byte key. Servers must unmask before reading.
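To make the header layout concrete, here is a minimal sketch that decodes a single, complete frame from a Node Buffer. parseFrame is a hypothetical helper; it assumes the whole frame is already in memory and skips the validation a real server needs (reserved bits, control-frame rules, fragmentation).

```javascript
// Hypothetical helper: decode one complete WebSocket frame held in a Buffer.
function parseFrame(buf) {
  const fin = (buf[0] & 0x80) !== 0;    // first bit: final frame of the message
  const opcode = buf[0] & 0x0f;         // low 4 bits: 0x1 text, 0x2 binary, 0x8 close, 0x9 ping, 0xA pong
  const masked = (buf[1] & 0x80) !== 0; // MASK bit
  let length = buf[1] & 0x7f;           // 7-bit payload length
  let offset = 2;

  if (length === 126) {
    // Extended length: the next 2 bytes hold the real length
    length = buf.readUInt16BE(offset);
    offset += 2;
  } else if (length === 127) {
    // Extended length: the next 8 bytes hold the real length
    length = Number(buf.readBigUInt64BE(offset));
    offset += 8;
  }

  let maskKey = null;
  if (masked) {
    maskKey = buf.subarray(offset, offset + 4); // 4-byte masking key
    offset += 4;
  }

  // Copy the payload so unmasking does not mutate the input buffer
  const payload = Buffer.from(buf.subarray(offset, offset + length));
  if (masked) {
    for (let i = 0; i < payload.length; i++) payload[i] ^= maskKey[i % 4];
  }
  return { fin, opcode, masked, payload };
}

// Masked text frame carrying "Hello" (sample bytes from RFC 6455)
const maskedHello = Buffer.from([0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58]);
const frame = parseFrame(maskedHello);
console.log(frame.fin, frame.opcode, frame.payload.toString('utf8'));
// prints: true 1 Hello
```

Unmasking is a plain XOR of each payload byte with the key byte at position i mod 4, so applying the same operation again re-masks the data.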

Ping and pong frames (opcodes 0x9 and 0xA) are the heartbeat mechanism. The server sends a ping, the browser responds with a pong automatically. No pong within the timeout window means the connection is dead and should be terminated.

Common mistakes

No reconnect logic

javascript
const ws = new WebSocket(url); // no onclose handler

A network blip closes the socket. The app stops receiving data, the user sees nothing. Fix (assuming a connect(url) function that creates the socket and attaches its handlers):

javascript
ws.onclose = () => setTimeout(() => connect(url), 1000);

Sending objects without JSON.stringify

javascript
ws.send({ msg: 'hi' }); // sends "[object Object]" - server parse fails

WebSocket sends strings or binary. Objects need serialization: ws.send(JSON.stringify({ msg: 'hi' })).
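A small pair of helpers keeps serialization in one place on both ends. This is a sketch: the { type, payload } envelope and the names encodeMessage and decodeMessage are assumptions for illustration, not part of the WebSocket API.

```javascript
// Hypothetical envelope helpers; the { type, payload } shape is an assumption.
function encodeMessage(type, payload) {
  return JSON.stringify({ type, payload });
}

// Returns the parsed message, or null if the input is not a valid envelope.
function decodeMessage(raw) {
  try {
    const msg = JSON.parse(raw);
    if (msg === null || typeof msg !== 'object' || typeof msg.type !== 'string') {
      return null;
    }
    return msg;
  } catch {
    return null; // not JSON at all
  }
}

console.log(String({ msg: 'hi' }));                // prints [object Object]
console.log(encodeMessage('chat', { msg: 'hi' })); // prints {"type":"chat","payload":{"msg":"hi"}}
console.log(decodeMessage('not json'));            // prints null
```

With these in place the call becomes ws.send(encodeMessage('chat', { msg: 'hi' })), and the receiving side can reject malformed input instead of throwing inside onmessage.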

No heartbeat in production

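I've seen this kill more deployments than any other WebSocket issue. Everything works on localhost, then Nginx closes the connection in staging. Nginx with its default proxy_read_timeout closes a proxied connection after 60 seconds without traffic, and many corporate proxies behave similarly. The browser API cannot send protocol-level ping frames, so send a small application-level message every 30 seconds (the server has to recognize it and ignore or answer it):

javascript
setInterval(() => {
  if (ws.readyState === WebSocket.OPEN) ws.send('ping'); // app-level keepalive, not a ping frame
}, 30000);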

No message size limit on the server

javascript
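ws.on('message', (msg) => {
  // In ws v8 and later msg is a Buffer, so length counts bytes
  if (msg.length > 1_000_000) return ws.close(1009, 'Too large');
  // process msg
});

Without a limit, a client can send a huge message and exhaust the process's memory. Note that this check runs only after the whole message has been buffered. The ws package also accepts a maxPayload server option (100 MiB by default) that rejects oversized messages before they are fully read.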

Using ws:// on HTTPS pages

Browsers block mixed content. On HTTPS, always use wss://.

Real-world usage

  • Socket.io (Node server, browser clients for React and others): WebSocket with HTTP long-polling fallback and a room abstraction. Trello reportedly used it in its early stack.
  • Phoenix Channels (Elixir): suited to Discord-style chat at scale, with 10k+ users per room.
  • ActionCable (Rails): the WebSocket layer built into Rails, commonly used for in-app notifications.
  • uWebSockets.js: a performance-focused server aimed at very large numbers of concurrent connections.
  • Native ws (Node): most common choice for Express-based APIs and internal dashboards.

Follow-up questions

Q: What does the 101 Switching Protocols response actually contain?
A: The status line, the Upgrade: websocket and Connection: Upgrade headers, and the Sec-WebSocket-Accept header: a SHA-1 hash of the client's Sec-WebSocket-Key concatenated with a fixed GUID, encoded in base64. The client verifies this value before switching out of HTTP mode.

Q: What happens when the network drops mid-session?
A: onclose fires with code 1006 (abnormal closure, no close frame received). The WebSocket API has no automatic reconnect. You implement exponential backoff yourself.
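The close code is available as event.code in onclose, and it is a reasonable input for deciding whether to retry. A sketch with a handful of the standard codes; the retry policy (do not retry after 1000 or 1008) and the describeClose name are assumptions, so adjust them to your application:

```javascript
// A subset of the standard close codes from RFC 6455.
const CLOSE_CODES = {
  1000: 'Normal Closure',
  1001: 'Going Away',
  1006: 'Abnormal Closure', // never sent on the wire, reported locally
  1008: 'Policy Violation',
  1009: 'Message Too Big',
  1011: 'Internal Error',
};

// Assumed policy: a clean close or a policy rejection should not be retried.
const NO_RETRY = new Set([1000, 1008]);

// Hypothetical helper: summarize a close code and whether to reconnect.
function describeClose(code) {
  return {
    code,
    reason: CLOSE_CODES[code] ?? 'Unknown',
    retry: !NO_RETRY.has(code),
  };
}

console.log(describeClose(1006));
// prints { code: 1006, reason: 'Abnormal Closure', retry: true }
```

In a browser this would be wired up as ws.onclose = (e) => { if (describeClose(e.code).retry) scheduleReconnect(); }, where scheduleReconnect is whatever backoff routine the app uses.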

Q: WebSocket vs HTTP/2 multiplexing - which to choose?
A: HTTP/2 multiplexes many request-response streams over one TCP connection, but each stream is still request-response. WebSocket gives true bidirectional push on one stream. For server-initiated messages, WebSocket wins. For parallel API requests, HTTP/2 is the better fit.

Q: How do you proxy WebSocket through Nginx?
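A: Add proxy_http_version 1.1; and proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection "upgrade"; to the location block that proxies the socket. Upgrade and Connection are hop-by-hop headers, so without these directives Nginx does not forward them, the backend never sees an upgrade request, and the handshake fails (typically with a 400-level error).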

Q: 1 million concurrent WebSocket connections - how do you scale without breaking session stickiness?
A: Consistent hashing on client ID routes each connection to a specific node (sticky sessions). Messages cross nodes through a pub/sub layer like Redis. Heartbeats keep connections alive through the load balancer. OS tuning matters too: net.core.somaxconn, open file descriptor limits. "Use PM2" is not the answer here - PM2 does not solve cross-process message routing.
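The routing step can be sketched as a hash ring with virtual nodes. This is a minimal illustration, not a production load balancer: the HashRing class, the node names, the use of MD5 for placement and the replica count of 100 are all assumptions.

```javascript
const crypto = require('node:crypto');

// Map a string to a 32-bit position on the ring (MD5 is used only for spread).
function hash32(value) {
  return crypto.createHash('md5').update(value).digest().readUInt32BE(0);
}

// Hypothetical consistent-hash ring with virtual nodes.
class HashRing {
  constructor(nodes, replicas = 100) {
    this.replicas = replicas;
    this.points = []; // sorted array of { pos, node }
    for (const node of nodes) this.add(node);
  }

  add(node) {
    for (let i = 0; i < this.replicas; i++) {
      this.points.push({ pos: hash32(`${node}#${i}`), node });
    }
    this.points.sort((a, b) => a.pos - b.pos);
  }

  remove(node) {
    this.points = this.points.filter((p) => p.node !== node);
  }

  // Return the node owning the first point at or after the client's position,
  // wrapping around to the start of the ring if needed.
  lookup(clientId) {
    if (this.points.length === 0) return null;
    const pos = hash32(clientId);
    let lo = 0;
    let hi = this.points.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.points[mid].pos < pos) lo = mid + 1;
      else hi = mid;
    }
    return this.points[lo % this.points.length].node;
  }
}

const ring = new HashRing(['ws-1', 'ws-2', 'ws-3']);
console.log(ring.lookup('client-42') === ring.lookup('client-42')); // prints true
```

The property that matters for stickiness: when a node is removed, only the clients that were on that node get a new assignment. Everyone else keeps their connection target, which a plain hash(clientId) % nodeCount scheme does not guarantee.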

Examples

Basic browser client with reconnect

javascript
function connect(url) {
  const ws = new WebSocket(url);

  ws.onopen = () => {
    console.log('Connected');
    ws.send(JSON.stringify({ type: 'join', room: 'general' }));
  };

  ws.onmessage = (e) => {
    const data = JSON.parse(e.data);
    console.log(`${data.user}: ${data.text}`); // Output: Alice: hello
  };

  ws.onclose = () => {
    console.log('Reconnecting in 2s...');
    setTimeout(() => connect(url), 2000);
  };

  return ws;
}

const ws = connect('wss://yourapi.com/chat');

The onclose reconnect loop is the minimal viable pattern. In production, use exponential backoff: start at 1s, double each retry, cap at 30s.
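The backoff schedule described above can be sketched as a pure delay function plus a reconnect wrapper. backoffDelay and connectWithBackoff are hypothetical names, and the wrapper assumes the browser's global WebSocket:

```javascript
// Delay before retry number `attempt` (0-based): 1s, 2s, 4s, ... capped at 30s.
function backoffDelay(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Reconnect wrapper; relies on the browser's global WebSocket.
function connectWithBackoff(url, attempt = 0) {
  const ws = new WebSocket(url);

  ws.onopen = () => {
    attempt = 0; // a successful connection resets the schedule
  };

  ws.onclose = () => {
    const delay = backoffDelay(attempt);
    setTimeout(() => connectWithBackoff(url, attempt + 1), delay);
  };

  return ws;
}

console.log([0, 1, 2, 3, 4, 5, 6].map((n) => backoffDelay(n)).join(', '));
// prints 1000, 2000, 4000, 8000, 16000, 30000, 30000
```

Adding a small random jitter to each delay is a common refinement, so that many clients dropped at the same moment do not all reconnect in the same instant.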

React chat room with WebSocket

javascript
import { useState, useEffect, useRef } from 'react';

function ChatRoom({ roomId }) {
  const [messages, setMessages] = useState([]);
  const ws = useRef(null);

  useEffect(() => {
    // Keep a local reference so cleanup always closes this effect's socket
    const socket = new WebSocket(`wss://yourapi.com/chat/${roomId}`);
    ws.current = socket;

    socket.onmessage = (e) => {
      const msg = JSON.parse(e.data);
      setMessages((prev) => [...prev, msg]); // Output: adds { user: 'Bob', text: 'Hi' }
    };

    return () => socket.close(); // close on unmount or roomId change - prevents a leaked connection
  }, [roomId]);

  const sendMessage = (text) => {
    if (ws.current?.readyState === WebSocket.OPEN) {
      ws.current.send(JSON.stringify({ text }));
    }
  };

  return (
    <div>
      {messages.map((m, i) => <p key={i}>{m.user}: {m.text}</p>)}
      <input onKeyDown={(e) => e.key === 'Enter' && sendMessage(e.target.value)} />
    </div>
  );
}

useRef keeps the socket instance across renders without triggering re-renders. The readyState === WebSocket.OPEN check prevents send errors while the socket is still connecting or already closed.

Node.js server with heartbeat and broadcast

javascript
const WebSocket = require('ws');

const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', (ws) => {
  ws.isAlive = true;
  ws.on('pong', () => { ws.isAlive = true; }); // reset flag on pong

  ws.on('message', (raw) => {
    if (raw.length > 1_000_000) return ws.close(1009, 'Message too large');

    let msg;
    try {
      msg = JSON.parse(raw);
    } catch {
      return ws.close(1007, 'Invalid JSON');
    }

    wss.clients.forEach((client) => {
      if (client.readyState === WebSocket.OPEN) {
        client.send(JSON.stringify(msg)); // broadcast to all connected clients
      }
    });
  });
});

const interval = setInterval(() => {
  wss.clients.forEach((ws) => {
    if (!ws.isAlive) return ws.terminate(); // no pong = dead, drop it
    ws.isAlive = false;
    ws.ping();
  });
}, 30000);

wss.on('close', () => clearInterval(interval)); // cleanup on server shutdown

The isAlive flag resets to false on each tick and comes back true only when a pong arrives. No pong in 30 seconds means the client is gone. The clearInterval in wss.on('close') stops the interval after the server shuts down.
