
How does Docker build cache work and how to manage it?

Docker build cache is the difference between a 60-second rebuild and a 2-second one. Knowing how the cache key is computed and how to keep it valid is the single biggest skill for fast Dockerfiles.

Theory

TL;DR

  • After each instruction, Docker stores the resulting layer in a cache.
  • On rebuild, Docker computes a cache key for each instruction. Match → reuse the layer; mismatch → re-execute and invalidate everything below.
  • Cache key components:
    • Previous layer's digest (the chain matters)
    • The instruction text itself
    • For COPY and ADD: the digest of every file being copied
    • For RUN: the command string, plus the values of any build args in scope. Docker does NOT inspect what the command does.
  • Order matters: put stable, expensive steps high; volatile, frequently-changing steps low.
  • BuildKit cache mounts (RUN --mount=type=cache,target=/path) persist a cache across builds without becoming part of any layer.
  • --no-cache rebuilds everything from scratch.

How cache invalidation works

FROM node:22-alpine        ← cached if the base image digest is unchanged
WORKDIR /app               ← cached if FROM was a cache hit
COPY package.json ./       ← cached if package.json bytes are unchanged
RUN npm ci                 ← cached if the previous step was a cache hit
COPY src/ ./src/           ← invalidated if any file in src/ changed
CMD ["node", "server.js"]  ← cached if the previous step was a cache hit

The key insight: Docker hashes file contents for COPY/ADD but never looks at what a RUN command produces. A step like RUN apt-get install -y curl is a cache hit even if the upstream repository now ships a newer curl.
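To make the chaining concrete, here is a toy model in shell. It is a sketch of the idea only, not BuildKit's actual algorithm: the `cache_key` function, the temp directory, and the sample package.json are all made up for illustration, and it assumes a `sha256sum` binary (GNU coreutils or BusyBox) is available.

```shell
#!/bin/sh
# Toy model of layer cache keys. This is NOT BuildKit's real algorithm,
# only the chaining idea: key = hash(parent key, instruction, copied file digests).
set -eu

# cache_key PARENT_KEY INSTRUCTION [FILE...]
cache_key() {
  parent=$1
  instruction=$2
  shift 2
  {
    printf '%s\n%s\n' "$parent" "$instruction"
    # COPY/ADD pass the files they copy; RUN passes none, so its output is never hashed.
    for f in "$@"; do
      sha256sum < "$f"
    done
  } | sha256sum | cut -d' ' -f1
}

workdir=$(mktemp -d)
printf '{"name":"demo"}\n' > "$workdir/package.json"

k_from=$(cache_key "" "FROM node:22-alpine")
k_copy=$(cache_key "$k_from" "COPY package.json ./" "$workdir/package.json")
k_run=$(cache_key "$k_copy" "RUN npm ci")

# Edit the copied file: the COPY key changes, and so does every key after it.
printf '{"name":"demo","version":"2"}\n' > "$workdir/package.json"
k_copy2=$(cache_key "$k_from" "COPY package.json ./" "$workdir/package.json")
k_run2=$(cache_key "$k_copy2" "RUN npm ci")

[ "$k_copy" != "$k_copy2" ] && echo "COPY: cache miss"
[ "$k_run" != "$k_run2" ] && echo "RUN npm ci: cache miss (parent changed)"
rm -rf "$workdir"
```

The RUN instruction text never changed, yet its key did, because the parent key is part of the input. That is why a miss propagates to every later step.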

Optimizing instruction order

dockerfile
# WRONG: source copied before deps are installed
FROM node:22-alpine
WORKDIR /app
# Any file change invalidates everything below
COPY . .
# Re-runs on every code change
RUN npm ci --omit=dev
CMD ["node", "server.js"]

# RIGHT: deps first, source last
FROM node:22-alpine
WORKDIR /app
# Changes only when deps change
COPY package*.json ./
# Cached unless package*.json changed
RUN npm ci --omit=dev
# Changes when source changes; only this step and the ones after it re-run
COPY . .
CMD ["node", "server.js"]

For a typical app with stable deps, this can turn a rebuild after a source-only change from roughly 60 seconds (the wrong way) into roughly 2 seconds (the right way).

BuildKit cache mounts

With BuildKit (the default builder since Docker Engine 23.0), you can mount a cache directory that persists across builds without being part of the image:

dockerfile
# syntax=docker/dockerfile:1.7
FROM python:3.13-slim
WORKDIR /app
COPY requirements.txt .
# No --no-cache-dir here: that flag would disable the very cache this mount persists
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]

The pip cache lives outside the layer. When requirements.txt changes and the layer has to be rebuilt, pip reuses the downloads and wheels already in the mount instead of fetching everything again. The layer stays clean; the wheels stay cached.

Common cache-mount targets:

  • pip: /root/.cache/pip
  • npm: /root/.npm
  • apt: /var/cache/apt and /var/lib/apt/lists with sharing=locked
  • Go modules: /go/pkg/mod
  • Cargo: /usr/local/cargo/registry
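The apt entry has a catch worth spelling out. The official Debian and Ubuntu images ship an apt hook (`/etc/apt/apt.conf.d/docker-clean`) that deletes downloaded packages after each install, which would leave the cache mount empty. The sketch below writes a demo Dockerfile that removes the hook first. The base image, the scratch directory, and the `apt-cache-demo` tag are arbitrary choices for illustration, and the script only prints the build command rather than running it.

```shell
#!/bin/sh
set -eu

# Write a demo Dockerfile into a scratch directory (the location is arbitrary).
demo_dir=$(mktemp -d)
cat > "$demo_dir/Dockerfile" <<'EOF'
# syntax=docker/dockerfile:1.7
FROM debian:bookworm-slim
# The official image deletes downloaded .deb files after install;
# remove that hook and tell apt to keep them so the cache mount has content.
RUN rm -f /etc/apt/apt.conf.d/docker-clean; \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
# sharing=locked makes concurrent builds take turns, since apt needs exclusive access.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt/lists,sharing=locked \
    apt-get update && \
    apt-get install -y --no-install-recommends curl
EOF

echo "Dockerfile written to $demo_dir/Dockerfile"
echo "Build it with: docker build -t apt-cache-demo $demo_dir"
```

With the package lists in a cache mount, the usual `rm -rf /var/lib/apt/lists/*` cleanup is unnecessary: the lists never become part of the layer.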

Sharing cache between builds (CI)

With BuildKit + docker buildx, you can export the build cache to a registry and import it again, so CI builds reuse cache across runners:

bash
# First build: write cache to registry
docker buildx build \
  --cache-to type=registry,ref=myreg/myapp:cache,mode=max \
  --cache-from type=registry,ref=myreg/myapp:cache \
  -t myreg/myapp:1.0 \
  --push .

# Subsequent builds (different runner) read from the same cache
docker buildx build \
  --cache-from type=registry,ref=myreg/myapp:cache \
  -t myreg/myapp:1.1 \
  --push .

A cold runner now starts as warm as the last successful build. Massive CI speedup for projects with heavy build steps.
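A single shared cache tag can be overwritten back and forth by unrelated branches. One possible convention, sketched below, is a cache tag per branch with the main branch's cache as a fallback, relying on the fact that `--cache-from` may be given more than once. The `cache_flags` helper and the `cache-<branch>` tag naming are hypothetical, not something buildx prescribes, and the script only prints the command.

```shell
#!/bin/sh
set -eu

# cache_flags IMAGE BRANCH
# Prints one buildx flag per line. The cache-<branch> tag naming is a made-up convention.
cache_flags() {
  image=$1
  # Replace characters that are not valid in an image tag with '-'
  tag=$(printf '%s' "$2" | tr -c 'A-Za-z0-9_.-' '-')
  printf '%s\n' \
    "--cache-from=type=registry,ref=${image}:cache-${tag}" \
    "--cache-from=type=registry,ref=${image}:cache-main" \
    "--cache-to=type=registry,ref=${image}:cache-${tag},mode=max"
}

# Dry run: print the command instead of executing it.
# The flags contain no spaces, so unquoted expansion splits them safely.
echo docker buildx build $(cache_flags myreg/myapp feature/login) -t myreg/myapp:dev --push .
```

On the first build of a new branch its own cache tag does not exist yet, so the build would lean on the main branch's cache until the branch has exported its own.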

Bypassing the cache

bash
# Rebuild everything from scratch
docker build --no-cache -t myapp .

# Refresh just the FROM (re-pull the base image)
docker build --pull -t myapp .

# Both
docker build --pull --no-cache -t myapp .

# Invalidate from a specific instruction onwards (BuildKit):
# declare ARG BUILD_REV in the Dockerfile at that point, then pass a value that changes
docker build --build-arg BUILD_REV="$(date +%s)" -t myapp .

Common mistakes

COPY . . before RUN install

Covered above. The single most common cache-killer.

Putting apt-get update in a separate RUN from apt-get install

dockerfile
# WRONG: update can cache hit while install pulls a stale package list
RUN apt-get update
RUN apt-get install -y --no-install-recommends curl

# RIGHT: keep them in one RUN so they always run together
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

If apt-get update is a cache hit and apt-get install runs, the install works from a stale package index: it may pull outdated versions, or fail because the package files it lists no longer exist on the mirror.

Copying source code in a way that invalidates the cache on every save

dockerfile
# Invalidated by an editor save in any file
COPY . .

For dev environments, use bind mounts at run time instead. For CI builds, accept that source changes invalidate later layers and design around it (deps first).
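Part of "designing around it" is keeping files that do not belong in the image out of the build context entirely, since a file that never enters the context cannot change the digest that COPY . . hashes. A minimal sketch, assuming a Node-style project: the entries below are common examples rather than a complete list, and the temp directory stands in for your project root.

```shell
#!/bin/sh
set -eu

project_dir=$(mktemp -d)   # stand-in for your project root
cat > "$project_dir/.dockerignore" <<'EOF'
# Files matched here never enter the build context,
# so changing them cannot invalidate COPY . .
.git
node_modules
.vscode
.idea
**/*.log
**/*.swp
EOF

echo "Wrote $project_dir/.dockerignore"
```

The `**/` prefix matches at any depth; a bare `*.log` would only match files at the root of the context.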

Forgetting that RUN does not look inside the command

dockerfile
# Same RUN string forever; never refreshes even if installer.sh changes
RUN curl https://example.com/installer.sh | sh

Docker's cache key for RUN is the literal command plus the build args in scope. To force re-execution, change the string or the value of an ARG it uses:

dockerfile
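ARG INSTALLER_SHA="abc123..."
# The standard checksum-line format has two spaces between digest and path;
# stricter sha256sum implementations reject a single space.
RUN curl https://example.com/installer.sh -o /tmp/i.sh && \
    echo "$INSTALLER_SHA  /tmp/i.sh" | sha256sum -c && \
    sh /tmp/i.sh
# Now changing INSTALLER_SHA invalidates this layer.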
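The pinned value has to come from somewhere. The sketch below shows one way to compute and check it on the host before a build. The helper names `file_sha256` and `verify_sha256` are made up for this example, and a temp file stands in for a downloaded installer.sh so the script runs without network access. It assumes a `sha256sum` binary is available.

```shell
#!/bin/sh
set -eu

# file_sha256 FILE: print only the hex digest of FILE
file_sha256() { sha256sum "$1" | cut -d' ' -f1; }

# verify_sha256 EXPECTED FILE: succeed only when FILE hashes to EXPECTED.
# The line fed to "sha256sum -c" uses the same "<digest>  <path>" format
# (two spaces) that sha256sum itself prints.
verify_sha256() { printf '%s  %s\n' "$1" "$2" | sha256sum -c >/dev/null 2>&1; }

installer=$(mktemp)                       # stand-in for a downloaded installer.sh
printf 'echo installing\n' > "$installer"
pinned=$(file_sha256 "$installer")

verify_sha256 "$pinned" "$installer" && echo "checksum ok"

printf 'echo tampered\n' >> "$installer"  # simulate an upstream change
verify_sha256 "$pinned" "$installer" || echo "checksum mismatch"

# Dry run: the value you would pin in the Dockerfile or pass at build time
echo "docker build --build-arg INSTALLER_SHA=$pinned -t myapp ."
```

Pinning this way gives two things at once: the build fails loudly if the upstream script changes, and updating the pin is the deliberate action that invalidates the layer.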

Inspecting and managing cache

bash
# See cache usage
docker system df         # high-level
docker buildx du         # build cache details

# Prune build cache
docker builder prune                        # dangling cache only; asks for confirmation
docker builder prune -af                    # all, unconditional
docker builder prune --filter 'until=72h'   # older than 3 days

# Show what BuildKit considered cached
DOCKER_BUILDKIT=1 docker build --progress=plain -t myapp .
# Output marks cache hits as CACHED; misses show the step's own output and a DONE line with its duration

Real-world usage

  • Local dev: dep-install layer cached → 2-second rebuilds for code changes. Productivity multiplier.
  • CI: --cache-from a registry brings the last build's cache to a fresh runner. This can cut a 10-minute build to around 90 seconds.
  • Cache mounts for package managers: pip/npm/apt caches persist across builds without bloating image.
  • Build farms (Bazel-style): the cache is shipped as a registry artifact; many builders share one cache.

Follow-up questions

Q: Why does my CI build never hit cache, even when nothing changed?


A: Each CI runner starts clean, with no local cache. Use --cache-to to export the cache to a registry that survives across runs, and --cache-from to import it on the next run.

Q: What is the difference between BuildKit cache mounts and image layers?


A: Layers are part of the image. Cache mounts are not — they live in a separate cache, attached at build time. Mounts are how you keep build-time caches (npm packages, pip wheels) without bloating your final image with files you only needed to compile.

Q: How do I invalidate just the latter half of a Dockerfile?


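A: Add an ARG CACHEBUST=1 line at the point where invalidation should start and pass --build-arg CACHEBUST=$(date +%s). The next build sees a different value, so the first instruction that uses the arg (any RUN after the ARG counts) misses the cache, and everything below it is rebuilt.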

Q: Does --pull invalidate everything?


A: Only if the base image actually has a new digest. --pull re-checks FROM, but if node:22-alpine resolves to the same digest as last time, the FROM stays cached and so does everything after.

Q: (Senior) How would you set up cache-from in a GitHub Actions matrix build?


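A: Use docker/build-push-action@v5 with cache-from: type=gha and cache-to: type=gha,mode=max. GitHub Actions provides a built-in cache backend per repo. In a matrix, give each variant its own scope on both options (for example scope=${{ matrix.target }}, where target is whatever your matrix key is called), because builds that share the default scope overwrite each other's cache. For more aggressive cross-job sharing, use type=registry,ref=ghcr.io/myorg/myapp:cache. Avoid type=local in CI, because runners are ephemeral.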

Examples

Optimal Node Dockerfile

dockerfile
# syntax=docker/dockerfile:1.7
FROM node:22-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci

FROM deps AS build
COPY . .
RUN npm run build

FROM node:22-alpine
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=deps /app/node_modules ./node_modules
USER node
CMD ["node", "dist/server.js"]

  • Stage deps only invalidates when package*.json changes.
  • npm cache mount survives between builds.
  • Source changes only re-run the build stage.

CI-shared cache via registry

yaml
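# .github/workflows/build.yml
# Registry cache export needs a buildx builder (docker-container driver)
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v5
  with:
    push: true
    tags: myorg/myapp:${{ github.sha }}
    cache-from: type=registry,ref=myorg/myapp:cache
    cache-to: type=registry,ref=myorg/myapp:cache,mode=max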

First run populates myorg/myapp:cache. Every subsequent run on any runner reuses it. Build times drop dramatically.
