What is CDN and why is it needed?


CDN (Content Delivery Network) is a globally distributed network of proxy servers that caches static assets and delivers them from locations close to the user.

Theory

TL;DR

CDN works like a chain of pizza franchises: your local store has pre-made dough (cache) instead of waiting for a shipment from headquarters across the country
Cuts latency from 200-500ms down to 20-100ms by routing requests to the nearest Point of Presence (PoP)
Offloads 70-95% of static traffic from your origin server
Worth adding for any site with a global audience or heavy static files (images, JS, CSS)

Quick example

html

<!-- No CDN: every request hits your single origin server -->
<link rel="stylesheet" href="/styles.css">
<!-- A US user loading from an EU server: ~400ms latency -->

<!-- CDN: nearby edge server serves a cached copy -->
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css">
<!-- Latency: ~50ms, cached globally, served from nearest PoP -->

Open the Network tab in DevTools. The CDN request shows a low Time to First Byte (TTFB). The origin request shows a high one. That gap is what CDN solves.

Without CDN, every asset request hits your single origin server regardless of where the user is. A user in Tokyo loading a file from a server in Frankfurt waits 300-500ms before anything appears on screen. CDN intercepts that request at a Tokyo edge server and serves a cached copy in 20-50ms. Your origin only gets involved on the first request for that file, or after the cache expires.
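A rough back-of-the-envelope model helps make those numbers concrete. This is only a sketch: the function names are illustrative, and it assumes every request pays one edge round trip while a miss additionally pays one edge-to-origin fetch. The inputs are picked from the ranges quoted in this article, not measured.

```javascript
// Rough model: every request pays the edge round trip;
// a cache miss additionally pays the edge-to-origin fetch.
function expectedLatencyMs(hitRatio, edgeMs, originMs) {
  return edgeMs + (1 - hitRatio) * originMs;
}

// Requests that still reach origin are the cache misses.
function originRequests(totalRequests, hitRatio) {
  return Math.round(totalRequests * (1 - hitRatio));
}

// Assumed example inputs: 90% hit ratio, 30 ms to the edge, 400 ms to origin.
const avg = expectedLatencyMs(0.9, 30, 400);
const toOrigin = originRequests(1000000, 0.9);
console.log(`average latency is about ${avg.toFixed(0)} ms, origin requests: ${toOrigin}`);
```

The useful takeaway is that the hit ratio drives both numbers: a few extra points of cache hits reduce average latency and origin load at the same time.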

When to use CDN

Global audience with users in multiple countries: CDN cuts geographic latency by 50-80%
Static-heavy site where images, JS bundles, and fonts make up most of the bandwidth: CDN takes that traffic off your origin
Traffic spikes around product launches or sales events: CDN handles the load without your server buckling
Local prototype or internal tool for a handful of users: skip it, origin hosting is enough
Pure dynamic API with no cacheable responses: CDN will not help there

How CDN works internally

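The browser resolves the CDN hostname and the request is steered to a nearby PoP, either because DNS returns the address of a nearby edge or because an anycast IP is routed to the closest location. The edge server checks its cache, where each entry stays fresh for a TTL taken from headers such as Cache-Control: max-age=3600. Cache hit: the file is served from disk or RAM immediately. Cache miss: the edge fetches from origin, stores a copy, then serves it.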
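The hit/miss flow described above can be sketched as a tiny pull-through cache. This is a simplified model, not how any particular CDN is implemented: the EdgeCache class, the synchronous fetchFromOrigin callback, and the injectable clock are all assumptions made for illustration.

```javascript
// Minimal pull-through cache, assuming a synchronous origin lookup for brevity.
class EdgeCache {
  constructor(fetchFromOrigin, ttlMs, now = () => Date.now()) {
    this.fetchFromOrigin = fetchFromOrigin; // hypothetical origin fetcher
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock so expiry can be exercised without waiting
    this.store = new Map(); // path -> { body, expiresAt }
  }

  get(path) {
    const entry = this.store.get(path);
    if (entry && entry.expiresAt > this.now()) {
      return { body: entry.body, status: 'HIT' };
    }
    // Miss or expired: go to origin, keep a copy, then serve it.
    const body = this.fetchFromOrigin(path);
    this.store.set(path, { body, expiresAt: this.now() + this.ttlMs });
    return { body, status: 'MISS' };
  }
}

// Usage with a fake origin and a fake clock.
let originHits = 0;
let clock = 0;
const edge = new EdgeCache(
  (path) => { originHits += 1; return `contents of ${path}`; },
  3600 * 1000,
  () => clock
);

const first = edge.get('/app.js');  // MISS: origin is contacted
const second = edge.get('/app.js'); // HIT: served from the edge copy
clock += 3600 * 1000;               // one hour later the entry has expired
const third = edge.get('/app.js');  // MISS again: origin is refetched
console.log(first.status, second.status, third.status, originHits);
```

Three requests reach the edge but only two reach the origin, which is the whole point of the pull model.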

This pull model is what CloudFront and Cloudflare use by default. Akamai and similar providers run Varnish-like caching proxies on their edge nodes, and a full cache evicts the least recently used files first, so a rarely requested file can disappear before its TTL expires. The important bit is that Cache-Control headers on your origin response control how long the CDN is allowed to hold the file.
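To illustrate how an edge might turn the origin's Cache-Control header into a TTL, here is a minimal sketch based on the standard HTTP caching directives. The function name is hypothetical, and treating a missing header as "do not cache" is a simplification, since real CDNs often apply their own default TTL in that case.

```javascript
// Returns how many seconds a shared cache (such as a CDN) may reuse a response
// without contacting origin, or 0 if it may not.
function sharedCacheTtlSeconds(cacheControl) {
  if (!cacheControl) return 0; // simplification: many CDNs use a default TTL here
  const directives = new Map();
  for (const part of cacheControl.toLowerCase().split(',')) {
    const [name, value] = part.trim().split('=');
    if (name) directives.set(name, value);
  }
  // private and no-store keep the response out of shared caches;
  // no-cache allows storing but requires revalidation before every reuse.
  if (directives.has('private') || directives.has('no-store') || directives.has('no-cache')) {
    return 0;
  }
  // s-maxage applies only to shared caches and overrides max-age for them.
  const raw = directives.has('s-maxage') ? directives.get('s-maxage') : directives.get('max-age');
  const seconds = Number.parseInt(raw, 10);
  return Number.isNaN(seconds) ? 0 : Math.max(0, seconds);
}

console.log(sharedCacheTtlSeconds('public, max-age=3600'));
console.log(sharedCacheTtlSeconds('public, max-age=60, s-maxage=86400'));
```

The s-maxage case is worth remembering: it lets you give the CDN a long TTL while browsers keep a short one.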

Common mistakes

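No Cache-Control headers. If your origin does not send explicit caching headers, the CDN either falls back to its own default TTL or goes back to origin on every request, depending on the provider. Either way you lose control over what gets cached and for how long, and in the worst case you get zero benefit.

// Weak: express.static defaults to Cache-Control: public, max-age=0,
// so any cached copy is stale immediately and must be revalidated
app.use('/static', express.static('public'));

// Better: let the CDN and browsers cache for 1 year
// (only safe when filenames change with their content)
app.use('/static', express.static('public', {
  maxAge: '1y',
  immutable: true
}));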

Forgetting cache invalidation after a deploy. You push a new app.js but keep the same filename. The CDN keeps serving the old version until the TTL expires, sometimes for days. Fix: use content hashes in filenames (app.abc123.js). Vite does this by default, and Webpack does it once output.filename includes [contenthash].
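The idea behind hashed filenames can be sketched in a few lines of Node.js. This is not how Vite or Webpack implement it internally; the hashedFilename helper, the SHA-256 choice, and the 8-character truncation are assumptions made for illustration.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Derive a filename that changes whenever the file content changes.
function hashedFilename(filename, content) {
  const hash = crypto.createHash('sha256').update(content).digest('hex').slice(0, 8);
  const { name, ext } = path.parse(filename);
  return `${name}.${hash}${ext}`;
}

const v1 = hashedFilename('app.js', 'console.log("v1");');
const v2 = hashedFilename('app.js', 'console.log("v2");');
console.log(v1, v2); // two different names, both of the form app.<hash>.js
```

Because the URL changes with the content, the old cached copy is simply never requested again, and the new file can be cached for as long as you like.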

Caching dynamic or personalized content. Setting Cache-Control: public, max-age=3600 on /api/user/profile means the CDN caches one user's data and serves it to everyone else. Use Cache-Control: private, no-cache for anything user-specific.
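One way to avoid all three mistakes is to decide the Cache-Control value in a single place. The sketch below is a hypothetical policy function: the /api/ prefix and the "dot, hex hash, dot, extension" filename pattern are assumptions about your URL layout, so adjust them to your project.

```javascript
// Pick a Cache-Control value for a response based on its URL path.
function cacheControlFor(urlPath) {
  // User-specific or API responses: keep them out of shared caches.
  if (urlPath.startsWith('/api/')) return 'private, no-cache';
  // Fingerprinted assets such as app.abc123.js never change under the same URL.
  if (/\.[0-9a-f]{6,}\.(js|css|png|jpg|svg|woff2)$/.test(urlPath)) {
    return 'public, max-age=31536000, immutable';
  }
  // Everything else (HTML, unhashed files): cacheable, but revalidate each time.
  return 'public, max-age=0, must-revalidate';
}

console.log(cacheControlFor('/api/user/profile'));
console.log(cacheControlFor('/static/app.abc123.js'));
console.log(cacheControlFor('/index.html'));
```

In an Express app, a function like this could be called from a small middleware that sets the header before the response is sent.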

Real-world usage

React/Vite: production build outputs dist/assets/index-[hash].js, uploaded to CDN with immutable cache headers
Next.js: Vercel's Edge Network automatically serves everything under _next/static from its CDN
Express: put Cloudflare in front of your public/ folder and add cache headers to static responses
Netflix: runs its own CDN called Open Connect that delivers the vast majority of its video traffic from appliances embedded directly in ISP networks

Follow-up questions

Q: How does DNS resolution work in a CDN?

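A: There are two common approaches. With DNS-based routing, the CDN's authoritative DNS answers with the IP address of a PoP close to the user (or to the user's DNS resolver). With anycast, the same IP address is announced from hundreds of locations globally, and network routing (BGP) delivers packets to the nearest one. Either way, the same CDN hostname might land on a Frankfurt server for a Berlin user and a Dallas server for someone in Houston.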
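As a toy illustration of "nearest PoP", the sketch below picks the geographically closest location from a hypothetical PoP list. Real CDNs decide based on network topology, measured latency, and load rather than straight-line distance, so treat this purely as a mental model; the PoP list and coordinates are approximate and assumed.

```javascript
// Hypothetical PoP list; coordinates are approximate city locations.
const POPS = [
  { city: 'Frankfurt', lat: 50.11, lon: 8.68 },
  { city: 'Dallas', lat: 32.78, lon: -96.8 },
  { city: 'Tokyo', lat: 35.68, lon: 139.69 },
];

// Great-circle distance in kilometres (haversine formula).
function distanceKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * 6371 * Math.asin(Math.sqrt(h));
}

// Return the PoP with the smallest distance to the user.
function nearestPop(user, pops = POPS) {
  return pops.reduce((best, pop) =>
    (distanceKm(user, pop) < distanceKm(user, best) ? pop : best));
}

const berlin = { lat: 52.52, lon: 13.4 };
const houston = { lat: 29.76, lon: -95.37 };
console.log(nearestPop(berlin).city, nearestPop(houston).city); // Frankfurt Dallas
```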

Q: What is the difference between push and pull CDN?

A: Pull CDN (default for CloudFront and Cloudflare) fetches content from origin on the first cache miss, then stores a copy. Push CDN requires you to upload content before users request it. Push makes sense for large files such as video, where you already know demand is coming.

Q: How do you handle cache invalidation at scale?

A: Path invalidation via the CDN API (/images/*) is the standard approach. Purge-all is a last resort because it triggers a wave of origin fetches. Better to design around it: hashed filenames mean the URL changes with every deploy, so old CDN entries become unreachable without any explicit invalidation.
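When you do need explicit invalidation, it helps to build the list of paths in one place. The helper below is a hypothetical sketch that produces an object in the same shape as the InvalidationBatch used in the CloudFront example later in this article; the function name and the normalization rules are assumptions.

```javascript
// Build an invalidation batch from a list of changed paths.
function buildInvalidationBatch(changedPaths, callerReference = Date.now().toString()) {
  // Normalize to a leading slash and drop duplicates.
  const items = [...new Set(changedPaths.map((p) => (p.startsWith('/') ? p : `/${p}`)))];
  return {
    Paths: { Quantity: items.length, Items: items },
    CallerReference: callerReference, // must be unique per invalidation request
  };
}

const batch = buildInvalidationBatch(['img.jpg', '/img.jpg', '/images/*'], 'deploy-42');
console.log(JSON.stringify(batch));
```

Keeping Quantity derived from Items avoids the mismatch errors you get when the two are maintained by hand.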

Q: When does CDN actually hurt performance?

A: A cache miss adds latency because the edge has to fetch from origin before it can respond, so rarely requested files can end up slower than a direct request. Over-caching dynamic content causes stale data bugs. Redirect loops happen when the CDN and origin each redirect the same URL back to the other.

Examples

Loading Bootstrap via CDN

The simplest CDN usage: loading a library from a public CDN instead of hosting it yourself.

html

<!DOCTYPE html>
<html>
<head>
  <!-- Served from nearest jsDelivr PoP, already cached globally -->
  <link
    rel="stylesheet"
    href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css"
  />
</head>
<body>
  <button class="btn btn-primary">Click me</button>
</body>
</html>

The file is cached on jsDelivr edge servers worldwide. A user in Tokyo gets it from a Tokyo PoP. A user in São Paulo gets it from a local one. Your server is not involved at all.

Cache invalidation with Express and CloudFront

This is where most junior developers hit a real bug. You update a file on S3, but users keep seeing the old version for hours.

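const express = require('express');
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
const { CloudFrontClient, CreateInvalidationCommand } = require('@aws-sdk/client-cloudfront');

const app = express();
const s3 = new S3Client({ region: 'us-east-1' });
const cf = new CloudFrontClient({ region: 'us-east-1' });

// express.raw parses the upload into a Buffer so req.body can go straight to S3
app.post('/update-image', express.raw({ type: 'image/jpeg', limit: '10mb' }), async (req, res) => {
  try {
    // Push new file to S3 origin
    await s3.send(new PutObjectCommand({
      Bucket: 'myapp-assets',
      Key: 'img.jpg',
      Body: req.body,
      ContentType: 'image/jpeg'
    }));

    // Invalidate CloudFront cache so users get the new version
    await cf.send(new CreateInvalidationCommand({
      DistributionId: 'E123ABC',
      InvalidationBatch: {
        Paths: { Quantity: 1, Items: ['/img.jpg'] },
        CallerReference: Date.now().toString() // must be unique per request
      }
    }));

    res.json({ ok: true });
  } catch (err) {
    // Report S3 or CloudFront failures instead of leaving the request hanging
    res.status(500).json({ ok: false, error: err.message });
  }
});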

Without the invalidation call, users can see the old img.jpg for up to 24 hours (CloudFront's default TTL). With it, the edge caches are typically cleared within a few minutes. I've seen this exact bug go to production in teams that were using S3 and CloudFront together for the first time.
