# How does Docker build cache work and how to manage it?


**Docker build cache** is the difference between a 60-second rebuild and a 2-second one. Knowing how the cache key is computed and how to keep it valid is the single biggest skill for fast Dockerfiles.

## Theory

### TL;DR

- After each instruction, Docker stores the resulting layer in a cache.
- On rebuild, Docker computes a **cache key** for each instruction. Match → reuse the layer; mismatch → re-execute and invalidate everything below.
- **Cache key components:**
  - Previous layer's digest (the chain matters)
  - The instruction text itself
  - For `COPY` and `ADD`: the digest of every file being copied
  - For `RUN`: just the command string (plus the values of any `ARG`/`ENV` variables in scope). Docker does NOT inspect what the command does.
- **Order matters:** put stable, expensive steps high; volatile, frequently-changing steps low.
- **BuildKit cache mounts** (`RUN --mount=type=cache,target=/path`) persist a cache across builds without becoming part of any layer.
- `--no-cache` rebuilds everything from scratch.

### How cache invalidation works

```
FROM alpine:3.21              ← cached if alpine:3.21 unchanged
WORKDIR /app                  ← cached if FROM unchanged
COPY package.json ./          ← cached if package.json bytes unchanged
RUN npm ci                    ← cached if previous step cache hit
COPY src/ ./src/              ← invalidates if any file in src/ changed
CMD ["node", "server.js"]     ← cached if previous step cache hit
```

The key insight: **Docker hashes file contents for COPY/ADD** but **not for RUN command outputs**. `RUN apt-get install curl` cache-hits even if upstream apt has a new curl version.
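
The chaining described above can be sketched as a toy model in shell. This is an illustration under simplifying assumptions, not BuildKit's actual key derivation: the function name `layer_key` and the way the inputs are concatenated are made up for the example. It only shows why a content change in a copied file produces a miss for that step and for every step after it.

```shell
#!/bin/sh
# Toy model of a layer cache key (NOT Docker's real algorithm):
#   key = sha256(parent key + instruction text + content digests of copied files)

# layer_key PARENT_KEY INSTRUCTION [FILE...]
layer_key() {
    parent=$1
    instruction=$2
    shift 2
    {
        printf '%s\n%s\n' "$parent" "$instruction"
        # COPY/ADD-style steps: file contents feed the key; mtimes do not.
        for f in "$@"; do
            sha256sum < "$f"
        done
    } | sha256sum | cut -d' ' -f1
}

# Build the key chain twice, editing package.json in between.
layer_key_demo() {
    dir=$(mktemp -d)
    echo '{"name":"demo"}' > "$dir/package.json"
    base=$(layer_key "" "FROM alpine:3.21")
    copy_a=$(layer_key "$base" "COPY package.json ./" "$dir/package.json")
    run_a=$(layer_key "$copy_a" "RUN npm ci")

    echo '{"name":"demo","version":"2.0.0"}' > "$dir/package.json"
    copy_b=$(layer_key "$base" "COPY package.json ./" "$dir/package.json")
    run_b=$(layer_key "$copy_b" "RUN npm ci")

    if [ "$copy_a" != "$copy_b" ]; then
        echo "COPY package.json: cache miss (file content changed)"
    fi
    if [ "$run_a" != "$run_b" ]; then
        echo "RUN npm ci: cache miss (parent key changed, command text did not)"
    fi
    rm -rf "$dir"
}

layer_key_demo
```

Note that the `RUN npm ci` key changes even though its command string is identical, because the parent key is part of the input. That is the "invalidate everything below" behaviour in miniature.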

### Optimizing instruction order

```dockerfile
# WRONG: source copied before deps installed
FROM node:22-alpine
WORKDIR /app
# Any file change invalidates this step and everything below it
COPY . .
# Re-runs on every code change
RUN npm ci --omit=dev
CMD ["node", "server.js"]

# RIGHT: deps first, source last
FROM node:22-alpine
WORKDIR /app
# Changes only when deps change
COPY package*.json ./
# Cached unless package*.json changed
RUN npm ci --omit=dev
# Changes when source changes; only this step re-runs
COPY . .
CMD ["node", "server.js"]
```

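For a typical app with stable deps, this can turn a rebuild of about a minute (the wrong way, reinstalling dependencies every time) into a couple of seconds (the right way, re-running only the final `COPY`). Exact numbers depend on the size of the dependency tree.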

### BuildKit cache mounts

With BuildKit (default in modern Docker), you can mount a cache directory that persists across builds without being part of the image:

```dockerfile
# syntax=docker/dockerfile:1.7
FROM python:3.13-slim
WORKDIR /app
COPY requirements.txt .
# No --no-cache-dir here: pip must be allowed to use the mounted cache
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]
```

The pip cache lives in the cache mount, outside any layer. When `requirements.txt` changes and the install layer has to be rebuilt, pip reuses the wheels it already downloaded instead of fetching them again. The layer stays clean; the wheels stay cached.

Common cache-mount targets:
- pip: `/root/.cache/pip`
- npm: `/root/.npm`
- apt: `/var/cache/apt` and `/var/lib/apt/lists` with `sharing=locked`
- Go modules: `/go/pkg/mod`
- Cargo: `/usr/local/cargo/registry`
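
As a sketch of the apt entry in the list above, assuming a Debian-based image (`debian:bookworm-slim` and the `curl` package are placeholders for the example): the official Debian and Ubuntu images ship an apt hook that deletes downloaded packages after each install, so it has to be disabled for the cache mount to retain anything.

```dockerfile
# syntax=docker/dockerfile:1
FROM debian:bookworm-slim

# Debian/Ubuntu base images delete downloaded .deb files by default;
# remove that hook and tell apt to keep them so the cache mount is useful.
RUN rm -f /etc/apt/apt.conf.d/docker-clean && \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
        > /etc/apt/apt.conf.d/keep-cache

# sharing=locked serializes concurrent builds, since apt expects exclusive access.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt/lists,sharing=locked \
    apt-get update && \
    apt-get install -y --no-install-recommends curl
```

There is no `rm -rf /var/lib/apt/lists/*` at the end here: the lists live in the cache mount, so they never reach the image layer.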

### Sharing cache between builds (CI)

With BuildKit + `docker buildx`, you can export and import cache to a registry, so CI builds reuse cache across runners:

```bash
# First build: write cache to registry
docker buildx build \
    --cache-to type=registry,ref=myreg/myapp:cache,mode=max \
    --cache-from type=registry,ref=myreg/myapp:cache \
    -t myreg/myapp:1.0 \
    --push .

# Subsequent builds (different runner) read from the same cache
docker buildx build \
    --cache-from type=registry,ref=myreg/myapp:cache \
    -t myreg/myapp:1.1 \
    --push .
```

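A cold runner now starts with the cache exported by the last build that wrote to `myreg/myapp:cache`, which can speed up CI considerably for projects with heavy build steps. Keep `--cache-to` on the builds that should refresh the cache; a build that only passes `--cache-from` reads it but never updates it.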

### Bypassing the cache

```bash
# Rebuild everything from scratch
docker build --no-cache -t myapp .

# Refresh just the FROM (re-pull the base image)
docker build --pull -t myapp .

# Both
docker build --pull --no-cache -t myapp .

# Invalidate from a specific instruction onwards:
# declare `ARG BUILD_REV` at that point in the Dockerfile, then pass a value that changes
docker build --build-arg BUILD_REV=$(date +%s) -t myapp .
```
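
For the build-arg approach, a minimal sketch of where the `ARG` could sit. `BUILD_REV` and the `.build-rev` marker file are illustrative choices, not a Docker convention. The `RUN` references the argument explicitly so its value is unambiguously part of that step's cache key.

```dockerfile
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

# Everything above is unaffected by BUILD_REV and stays cached.
ARG BUILD_REV=0
# A new BUILD_REV value changes this step's cache key, so this step
# and all steps after it are rebuilt.
RUN echo "build rev: ${BUILD_REV}" > /app/.build-rev

COPY . .
CMD ["node", "server.js"]
```

Building with `docker build --build-arg BUILD_REV=$(date +%s) -t myapp .` should then reuse the dependency layers and rebuild from the `RUN echo` step down. Omitting the flag falls back to the default `0`, so ordinary builds keep hitting the cache.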

### Common mistakes

**`COPY . .` before `RUN install`**

Covered above. The single most common cache-killer.

**Putting `apt-get update` in a separate RUN from `apt-get install`**

```dockerfile
# WRONG: update can cache hit while install pulls a stale package list
RUN apt-get update
RUN apt-get install -y --no-install-recommends curl

# RIGHT: keep them in one RUN so they always run together
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
```

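If `apt-get update` is a cache hit and `apt-get install` runs later (for example after you add a package to the list), the install works from a stale package index: it may fail because the referenced package versions are no longer on the mirror, or install outdated versions.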

**Baking source code into the image during development, so every save invalidates the cache**

```dockerfile
# Invalidated by an editor save in any file
COPY . .
```

For dev environments, use bind mounts at run time instead. For CI builds, accept that source changes invalidate later layers and design around it (deps first).

**Forgetting that `RUN` does not look inside the command**

```dockerfile
RUN curl https://example.com/installer.sh | sh
# Same RUN string forever; never refreshes even if installer.sh changes.
```

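Docker's cache key for `RUN` is the literal command. To force re-execution, change the command string or the value of a build argument it references. Pinning a checksum does both jobs: it verifies the download and gives you a value to bump: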

```dockerfile
ARG INSTALLER_SHA="abc123..."
RUN curl https://example.com/installer.sh -o /tmp/i.sh && \
    echo "$INSTALLER_SHA  /tmp/i.sh" | sha256sum -c && \
    sh /tmp/i.sh
# Now changing INSTALLER_SHA invalidates this layer.
```
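
One way to obtain the value for `INSTALLER_SHA` is sketched below. The helper names `file_sha256` and `installer_sha` are hypothetical, and the sketch assumes `curl` and `sha256sum` are available on the build host. Review the downloaded script before pinning its hash; pinning whatever the server returns without looking at it defeats the purpose of the check.

```shell
#!/bin/sh
# file_sha256 PATH: print the hex SHA-256 digest of a local file.
file_sha256() {
    sha256sum "$1" | cut -d' ' -f1
}

# installer_sha URL: download the script to a temp file and print its digest.
# Hypothetical helper; it does not execute the script.
installer_sha() {
    tmp=$(mktemp) || return 1
    if curl -fsSL "$1" -o "$tmp"; then
        file_sha256 "$tmp"
        status=0
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Usage could look like `docker build --build-arg INSTALLER_SHA="$(installer_sha https://example.com/installer.sh)" -t myapp .`, or, more conservatively, run `installer_sha` once, inspect the script, and commit the digest as the `ARG` default.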

### Inspecting and managing cache

```bash
# See cache usage
docker system df               # high-level
docker buildx du               # build cache details

# Prune build cache
docker builder prune            # interactive
docker builder prune -af        # all, unconditional
docker builder prune --filter 'until=72h'  # older than 3 days

# Show what BuildKit considered cached
DOCKER_BUILDKIT=1 docker build --progress=plain -t myapp .
# Output marks cache hits with CACHED; misses show the step's own output followed by DONE
```
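
To get a quick hit rate out of that log, a small filter can count the steps. This sketch assumes the plain progress format prints step headers like `#5 [2/4] WORKDIR /app` and hit markers like `#5 CACHED`; that format is not a stable interface and may differ between BuildKit versions. The function name `cache_summary` is made up for the example.

```shell
#!/bin/sh
# cache_summary: read `docker build --progress=plain` output on stdin and
# print how many Dockerfile steps were cache hits.
# Assumes step headers like "#5 [2/4] ..." or "#7 [build 1/3] ..." and
# hit markers like "#5 CACHED". Internal steps ("[internal] ...") are ignored.
cache_summary() {
    awk '
        /^#[0-9]+ \[[^]]*[0-9]+\/[0-9]+\] / {
            if (!($1 in steps)) { steps[$1] = 1; total++ }
        }
        /^#[0-9]+ CACHED$/ {
            if (($1 in steps) && !($1 in hit)) { hit[$1] = 1; cached++ }
        }
        END { print "cached=" cached + 0 " total=" total + 0 }
    '
}
```

BuildKit writes progress to stderr, so a typical invocation would be `docker build --progress=plain -t myapp . 2>&1 | cache_summary`.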

### Real-world usage

- **Local dev:** dep-install layer cached → rebuilds after a code change take seconds instead of a full reinstall. Productivity multiplier.
- **CI:** `--cache-from type=registry,...` brings the last build's cache to a fresh runner, which can cut a multi-minute build down to the steps that actually changed.
- **Cache mounts for package managers:** pip/npm/apt caches persist across builds without bloating the image.
- **Build farms (Bazel-style):** the cache is shipped as a registry artifact; many builders share one cache.

### Follow-up questions

**Q:** Why does my CI build never hit cache, even when nothing changed?

**A:** Most hosted CI runners start clean, with no local cache. Export the cache with `--cache-to` to a registry that survives across runs, and read it back with `--cache-from` on later builds.

**Q:** What is the difference between BuildKit cache mounts and image layers?

**A:** Layers are part of the image. Cache mounts are not — they live in a separate cache, attached at build time. Mounts are how you keep build-time caches (npm packages, pip wheels) without bloating your final image with files you only needed to compile.

**Q:** How do I invalidate just the latter half of a Dockerfile?

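**A:** Add an `ARG CACHEBUST=1` line at the right point and pass `--build-arg CACHEBUST=$(date +%s)`. The next build sees a different value and misses the cache from the first instruction after the `ARG` that uses it (typically the next `RUN`), and for everything below that.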

**Q:** Does `--pull` invalidate everything?

**A:** Only if the base image actually has a new digest. `--pull` re-checks `FROM`, but if `node:22-alpine` resolves to the same digest as last time, the FROM stays cached and so does everything after.

**Q:** (Senior) How would you set up cache-from in a GitHub Actions matrix build?

**A:** Use `docker/build-push-action@v5` with `cache-from: type=gha` and `cache-to: type=gha,mode=max`. GitHub Actions provides a built-in cache backend per repo. For more aggressive cross-job sharing, use `type=registry,ref=ghcr.io/myorg/myapp:cache`. Avoid `type=local` in CI — runners are ephemeral.

## Examples

### Optimal Node Dockerfile

```dockerfile
# syntax=docker/dockerfile:1.7
FROM node:22-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci

FROM deps AS build
COPY . .
RUN npm run build

FROM node:22-alpine
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=deps /app/node_modules ./node_modules
USER node
CMD ["node", "dist/server.js"]
```

- Stage `deps` only invalidates when `package*.json` changes.
- npm cache mount survives between builds.
- Source changes only re-run the `build` stage.

### CI-shared cache via registry

```yaml
# .github/workflows/build.yml
- uses: docker/build-push-action@v5
  with:
    push: true
    tags: myorg/myapp:${{ github.sha }}
    cache-from: type=registry,ref=myorg/myapp:cache
    cache-to: type=registry,ref=myorg/myapp:cache,mode=max
```

First run populates `myorg/myapp:cache`. Every subsequent run on any runner reuses it, so steps whose inputs have not changed are pulled from the registry instead of being rebuilt.
