How to reduce the size of a Docker image?
Reducing Docker image size is part hygiene, part architecture. The right techniques applied to the right problem can shrink an image from 1 GB to 30 MB without losing functionality. Smaller images mean faster pulls, faster deploys, smaller attack surface.
Theory
TL;DR
Five techniques, in approximate order of impact:
- Multi-stage build with a slim final base (`alpine`, `distroless`, `scratch`). The single biggest win.
- Smaller base image: Debian slim → Alpine → distroless → scratch. Each step ~50-100 MB smaller.
- Single `RUN` for install + cleanup so cache files do not get baked into a layer.
- `.dockerignore` to keep the build context small (no `node_modules`, `.git`, etc.).
- Strip dev dependencies, recommended packages, and unused locales in the runtime stage.
Measure with `docker images` and `docker history`. For deep analysis, use `dive`.
Quick example: before and after
Before (single stage, naive):
```dockerfile
FROM node:22
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
CMD ["npm", "start"]
```

Final: ~1.2 GB.
After (multi-stage, alpine, prune):
```dockerfile
FROM node:22-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
RUN npm prune --omit=dev

FROM node:22-alpine
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
USER node
CMD ["node", "dist/server.js"]
```

Final: ~180 MB. Same functionality.
For static sites (no Node runtime needed):
```dockerfile
# Stage 1 (build) as above, then:
FROM nginx:1.27-alpine
COPY --from=build /app/dist /usr/share/nginx/html
```

Final: ~30 MB.
Technique 1: multi-stage with a slim final base
See the dedicated multi-stage article. Bottom line: the toolchain is the heaviest thing in your image; multi-stage is how you leave it behind.
Base image options for the final stage, in order of size:
| Base | Approx. size | Has shell? | Has package manager? |
|---|---|---|---|
| `debian:bookworm` | 120 MB | Yes (bash) | apt |
| `debian:bookworm-slim` | 75 MB | Yes (bash) | apt |
| `ubuntu:24.04` | 80 MB | Yes (bash) | apt |
| `alpine:3.21` | 7-8 MB | Yes (sh, busybox) | apk |
| `gcr.io/distroless/base` | 20 MB | No | No |
| `gcr.io/distroless/static` | 2 MB | No | No |
| `scratch` | 0 | No | No |
Pick the smallest that has what your binary actually needs.
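As a concrete sketch of moving down that table, here is what a Node service on distroless might look like (the distroless tag and stage layout are illustrative; distroless Node images use `node` as the entrypoint, so `CMD` is just the script path):

```dockerfile
# Build stage: full toolchain, discarded at the end
FROM node:22-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --omit=dev

# Final stage: Node runtime only -- no shell, no package manager
FROM gcr.io/distroless/nodejs22-debian12
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
CMD ["dist/server.js"]
```

Debugging this image requires the `:debug` tag variant, since there is no shell to `docker exec` into.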
Technique 2: combine RUN commands and clean cache
```dockerfile
# WRONG: each RUN is a layer; apt cache survives in layer 2
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

# RIGHT: one layer, cache deleted in the same step
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
```

The wrong version saves no space: layer 2 holds the apt cache, and layer 3 only adds whiteout markers (the cache files are still on disk).
Apply the same pattern to:
- `apk` (Alpine): `apk add --no-cache <pkg>` (auto-cleans)
- `pip`: `pip install --no-cache-dir <pkg>`
- `npm`: `npm ci --omit=dev && npm cache clean --force`
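A sketch of the same pattern in Dockerfile form (package names are placeholders):

```dockerfile
# Alpine: --no-cache skips writing the apk index cache at all
RUN apk add --no-cache ca-certificates

# Python: skip pip's download/wheel cache
RUN pip install --no-cache-dir -r requirements.txt

# Node: production deps only, then drop npm's cache
RUN npm ci --omit=dev && npm cache clean --force
```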
Technique 3: .dockerignore
Anything in your build context gets sent to the daemon, slowing builds and bloating layers. A typical .dockerignore:
```
.git
node_modules
dist
*.log
.env*
Dockerfile*
README.md
coverage
.vscode
.idea
```

Without this, `COPY . .` ships gigabytes you do not need.
Technique 4: drop dev dependencies and recommended packages
```dockerfile
# Node
RUN npm ci --omit=dev

# Python
RUN pip install --no-cache-dir --prefix=/install <pkgs>
# Then in the final stage, COPY only /install

# Go: nothing to do (binary is self-contained)

# apt: skip recommended packages
RUN apt-get install --no-install-recommends -y curl
```

Dev deps (TypeScript compiler, jest, eslint) often double `node_modules`. `--no-install-recommends` cuts apt's optional packages.
Technique 5: minimize what gets COPIED
```dockerfile
# Granular copies are smaller AND better for caching
COPY package*.json ./     # only lockfiles → install
RUN npm ci
COPY src/ ./src/          # only what the runtime needs
COPY public/ ./public/
```

Vs. `COPY . .`, which copies tests, docs, IDE config, build outputs.
Inspecting and finding the bloat
```shell
# Per-layer sizes
$ docker history --no-trunc myimage
IMAGE          CREATED         CREATED BY                                    SIZE
4f06b3e2c0c1   2 minutes ago   /bin/sh -c #(nop) CMD ["node" "server.js"]   0B
<missing>      2 minutes ago   /bin/sh -c npm prune --omit=dev              156MB   ← attack this
<missing>      3 minutes ago   /bin/sh -c npm run build                     34MB
<missing>      4 minutes ago   /bin/sh -c npm ci                            312MB   ← biggest culprit
...

# Interactive layer-by-layer view
$ dive myimage
# Shows each layer's added/removed/total bytes, file tree per layer.
```

`dive` is the gold standard for understanding why an image is what it is.
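`dive` also has a non-interactive CI mode that fails a build when an image wastes too much space. A sketch of a `.dive-ci` rules file, with illustrative thresholds:

```yaml
rules:
  # ratio of image bytes not wasted by duplicated or deleted files
  lowestEfficiency: 0.95
  # absolute cap on bytes shadowed or removed by later layers
  highestWastedBytes: 20MB
  # the same cap as a fraction of total image size
  highestUserWastedPercent: 0.10
```

Run it with `dive --ci myimage`; a failed rule produces a nonzero exit code, so it can gate a pipeline.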
Common mistakes
Adding files in one layer, deleting in another
```dockerfile
# WRONG: 200 MB still in layer N, layer N+1 just hides it
COPY bigfile.tar.gz /tmp/
RUN unpack-and-process /tmp/bigfile.tar.gz
RUN rm -rf /tmp/*   # whiteout, but the data is in layer N forever

# RIGHT: do it all in one layer
RUN mkdir -p /tmp/x && \
    curl -L https://... | tar xz -C /tmp/x && \
    process /tmp/x && \
    rm -rf /tmp/x
```

Layers are immutable. Once a file lands in a layer, no later layer can shrink the image; only the original layer can avoid having the file.
Using apt without --no-install-recommends
Debian's apt installs "recommended" packages by default. For a server image, almost none are needed. Always:
```dockerfile
RUN apt-get update && \
    apt-get install --no-install-recommends -y curl && \
    rm -rf /var/lib/apt/lists/*
```

Picking Debian when Alpine works
Most language runtimes have an Alpine variant: `node:22-alpine`, `python:3.13-alpine`, `golang:1.23-alpine`. They are usually 70-80% smaller. Caveat: Alpine uses musl libc, not glibc, so some prebuilt binaries (NumPy with Intel MKL, some Node native modules) do not work on Alpine. When that bites, use `*-slim` Debian variants.
Defaulting to latest and getting bigger images by accident
`node:latest` might be 1 GB; `node:22-alpine` is 200 MB. Picking the right tag is half the battle.
Real-world usage
- Static site distribution: `nginx:alpine` final stage → 25-30 MB. Industry standard.
- Go services: `FROM scratch` + binary → 5-15 MB. Serverless-fast cold starts.
- Python ML services: `python:3.13-slim` + only required packages, with `--no-cache-dir` everywhere → 200-500 MB instead of 2 GB.
- CI build images: the one place where size matters less; they live on the runner. But still, a 5 GB CI image slows every job.
Follow-up questions
Q: Does compressing my files reduce image size?
A: Not really — Docker layers are already gzip-compressed on push/pull. The wins come from removing files, not compressing them.
Q: Why is my image so much bigger than the sum of files inside?
A: Because of how layers work — files added then deleted still take space. Use `dive` or `docker history` to find the bloat.
Q: Should I use Alpine for everything?
A: Most things, yes. Exceptions: Python ML/data-science (NumPy, SciPy, pandas have prebuilt wheels for glibc; Alpine forces musl-compatible builds, slow), heavy native dependencies. For these, *-slim Debian is a better default.
Q: What is the difference between distroless and Alpine?
A: Alpine has busybox, sh, apk — small but not minimal. Distroless has only the runtime your language needs (Node, Python, JVM, or none for static). No shell, no package manager, no anything. Smaller and more secure than Alpine; harder to debug (no `docker exec … sh`).
Q: (Senior) When does aggressive size reduction become counterproductive?
A: When debugging in production becomes impossible (no shell, no tools). Use a separate :debug variant for that. When build complexity skyrockets (10-stage Dockerfiles with custom apk repositories) for marginal gains. When the squeezed image breaks at runtime because some lib was missing. Find the sweet spot: small enough to pull fast and minimize attack surface, big enough to debug when needed.
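One way to get both worlds from a single Dockerfile is a debug target next to the production one. A sketch, assuming a `build` stage like the ones in the examples (the distroless `:debug` tags add a busybox shell):

```dockerfile
# Production target: no shell, minimal attack surface
FROM gcr.io/distroless/nodejs22-debian12 AS prod
COPY --from=build /app/dist ./dist
CMD ["dist/server.js"]

# Debug target: identical layout plus a busybox shell for docker exec
FROM gcr.io/distroless/nodejs22-debian12:debug AS debug
COPY --from=build /app/dist ./dist
CMD ["dist/server.js"]
```

Build with `docker build --target prod .` normally, and `--target debug` when you need to poke around a running container.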
Examples
Static site: 1.2 GB → 28 MB
```dockerfile
# BEFORE (1.2 GB)
FROM node:22
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
CMD ["npx", "http-server", "dist"]
```

```dockerfile
# AFTER (28 MB)
FROM node:22-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM nginx:1.27-alpine
COPY --from=build /app/dist /usr/share/nginx/html
```

Python ML service: 2.5 GB → 480 MB
```dockerfile
# AFTER
FROM python:3.13-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

FROM python:3.13-slim
WORKDIR /app
COPY --from=build /install /usr/local
COPY app.py .
USER 1000:1000
CMD ["python", "app.py"]
```

Key moves: slim base, `--no-cache-dir`, isolated install via prefix and copy.
Go service: 700 MB → 12 MB
```dockerfile
FROM golang:1.23-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /out/server ./cmd/server

FROM scratch
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /out/server /server
USER 65532:65532
ENTRYPOINT ["/server"]
```

`-ldflags="-s -w"` strips Go binary debug symbols. `FROM scratch` adds nothing. The binary plus a TLS cert bundle is the entire image.