# What is a Dockerfile?

## Short answer

**A Dockerfile** is a plain text file with build instructions Docker reads top-to-bottom to produce an image. Instructions that change the filesystem (`RUN`, `COPY`, `ADD`) each add a layer.

```dockerfile
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
USER node
CMD ["node", "server.js"]
```

```bash
$ docker build -t myapp:1.0 .
```

**Key:** the Dockerfile is the recipe, `docker build` runs it, the result is an image. Order matters - stable, rarely-changing layers first, frequently-changing layers last, so cache is reused on rebuild.

## Full answer

**A Dockerfile** is a plain text file containing instructions that Docker reads top-to-bottom to assemble an image. Instructions that modify the filesystem create a new layer; the layers stack to form the final image.

## Theory

### TL;DR

- Plain text. No JSON, no YAML. One instruction per line, uppercase by convention: `FROM`, `RUN`, `COPY`, `CMD`, etc. A trailing `\` continues an instruction on the next line.
- Goes top-to-bottom. Earlier instructions land in lower layers; later ones add on top.
- Each build step is cached. Same instruction with the same input on rebuild = cache hit, no work.
- **Order matters**: put stable, expensive things (system deps) early, frequently-changing things (your source code) late.
- Multi-stage builds let you build in one stage and copy only the artifact into a slim runtime stage. Smaller, safer images.

### Quick example

```dockerfile
# Dockerfile - typical Node.js app
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
USER node
EXPOSE 3000
CMD ["node", "server.js"]
```

```bash
$ docker build -t myapp:1.0 .
[+] Building 12.4s (10/10) FINISHED
 => [1/5] FROM node:22-alpine
 => [2/5] WORKDIR /app
 => [3/5] COPY package*.json ./
 => [4/5] RUN npm ci --omit=dev
 => [5/5] COPY . .
 => exporting layers
```

Eight instructions, five build steps (`USER`, `EXPOSE`, and `CMD` only record metadata, so they do not appear as steps). Change a source file and rebuild: only step 5 re-executes. Steps 1-4 are served from cache.

### Key instructions

| Instruction | What it does |
|---|---|
| `FROM image[:tag]` | Sets the base image. Must be the first instruction; only `ARG` and parser directives such as `# syntax=` may come before it. |
| `WORKDIR /path` | Sets the working directory for following `RUN`, `COPY`, `ADD`, `CMD`, `ENTRYPOINT`. Creates the dir if missing. |
| `COPY src dest` | Copies files from build context into the image. |
| `ADD src dest` | Like `COPY` but also handles URLs and extraction of local tar archives. **Prefer `COPY`** unless you need those features. |
| `RUN cmd` | Runs a shell command at build time. Common: install packages, build artifacts. |
| `ENV KEY=value` | Sets an environment variable that persists in the image. |
| `EXPOSE 80` | Documentation only - says "this image listens on port 80". Does not actually publish anything. |
| `USER name\|uid` | Sets the user for following instructions and the running container. Default: root (avoid). |
| `CMD ["prog", "arg"]` | Default command when a container starts. Can be overridden by `docker run`. |
| `ENTRYPOINT ["prog"]` | The fixed first part of the command; `CMD` becomes its default args. |
| `ARG name` | Build-time variable, set with `--build-arg`. Not present at runtime (use `ENV` for that). |

### CMD vs ENTRYPOINT

Both define what runs when a container starts. The difference matters when users override.

```dockerfile
# Pattern A: CMD only
CMD ["echo", "hello"]
# docker run myimage          -> echo hello
# docker run myimage echo bye -> echo bye (CMD fully replaced)

# Pattern B: ENTRYPOINT + CMD
ENTRYPOINT ["echo"]
CMD ["hello"]
# docker run myimage     -> echo hello
# docker run myimage bye -> echo bye (CMD replaced, ENTRYPOINT stays)
```

Use `ENTRYPOINT` when the image is one tool (e.g., a CLI). Use `CMD` alone when the image is a service that takes no args.
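The "image is one tool" case can be made concrete with a small sketch. The image name `mycurl` and the choice of `curl` as the wrapped tool are assumptions for illustration, not part of the original answer; the base image tag is likewise just an example.

```dockerfile
# Hypothetical single-tool image: the container behaves like the curl CLI
FROM alpine:3.20
RUN apk add --no-cache curl
# ENTRYPOINT fixes the program; CMD only supplies default arguments
ENTRYPOINT ["curl"]
CMD ["--help"]
# docker build -t mycurl .
# docker run --rm mycurl                         -> curl --help
# docker run --rm mycurl -sS https://example.com -> curl -sS https://example.com
# docker run --rm --entrypoint sh mycurl -c 'curl --version'
#   -> sh -c 'curl --version' (--entrypoint replaces ENTRYPOINT and drops the default CMD)
```

With this layout, everything after the image name on `docker run` is passed to the tool as arguments, which is usually what users of a CLI image expect.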
### Build cache and instruction order

Docker caches each layer by its instruction + inputs. Reorder for cache efficiency:

```dockerfile
# WRONG: source copied before deps installed
FROM node:22-alpine
WORKDIR /app
# any code change invalidates everything below
COPY . .
# re-runs on every code change
RUN npm ci --omit=dev
CMD ["node", "server.js"]

# RIGHT: deps installed before source copied
FROM node:22-alpine
WORKDIR /app
# only changes when deps change
COPY package*.json ./
# cached unless package*.json changed
RUN npm ci --omit=dev
# changes when source changes; only this and below re-run
COPY . .
CMD ["node", "server.js"]
```

The difference: a rebuild after a one-line code change in `server.js` skips the dependency install entirely, so it can finish in about a second instead of around a minute, depending on the size of the dependency tree.

### Multi-stage builds

Build in a fat stage, copy artifacts into a slim stage. Result: smaller, more secure final images.

```dockerfile
# Stage 1: build
FROM node:22-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# produces /app/dist
RUN npm run build

# Stage 2: runtime
FROM nginx:1.27-alpine
COPY --from=build /app/dist /usr/share/nginx/html
EXPOSE 80
```

The final image is `nginx:1.27-alpine` plus your built static files. The Node toolchain, source code, `node_modules` - none of it lands in the runtime image. Smaller attack surface, smaller image, faster pull.

### Common mistakes

**Running as root in the final stage**

```dockerfile
# WRONG: default user is root
FROM node:22
WORKDIR /app
COPY . .
CMD ["node", "app.js"]

# RIGHT: drop privileges
FROM node:22
WORKDIR /app
COPY --chown=node:node . .
USER node
CMD ["node", "app.js"]
```

A process that escapes a root container is still root on the host, unless user-namespace remapping or rootless mode is in use. Always switch to a non-root user before `CMD`.

**Not using `.dockerignore`**

```
# .dockerignore
node_modules
.git
dist
*.log
.env*
Dockerfile
```

Without it, `COPY . .` ships your `node_modules` and `.git` to the daemon, slowing builds and bloating the image.
**Splitting related `RUN` commands across layers**

```dockerfile
# WRONG: each RUN is a layer; this creates three layers and leaves apt cache in image
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

# RIGHT: one layer, cache cleaned in same step
RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*
```

If you delete files in a later layer, the earlier layer still contains them - the deletion just hides them. Clean up in the same `RUN` that created the mess.

**Using `ADD` when `COPY` is enough**

`ADD` extracts local tarballs and fetches URLs. Both behaviors surprise people. Use `COPY` for plain file copies; reach for `ADD` only when you actually need its extra features.

### Real-world usage

- **CI/CD pipelines:** every PR triggers `docker build` against the repo's Dockerfile. When the layer cache is preserved between runs, a well-ordered Dockerfile reuses most of its layers and keeps builds fast.
- **Multi-arch builds:** `docker buildx build --platform linux/amd64,linux/arm64 -t myapp:1.0 .` produces a multi-platform image from one Dockerfile. Used when the same app deploys to x86 servers and ARM (Mac M-series, Graviton).
- **Distroless / scratch images:** `FROM gcr.io/distroless/base` or `FROM scratch` for the final stage of a multi-stage build. The final image contains little or nothing beyond your binary - no shell, no package manager, very little attack surface beyond the app itself.
- **BuildKit features:** `# syntax=docker/dockerfile:1.7` at the top pins the Dockerfile frontend version and makes features like `RUN --mount=type=cache,target=/root/.npm` available for a persistent npm cache across builds.

### Follow-up questions

**Q:** What is the difference between `RUN`, `CMD`, and `ENTRYPOINT`?

**A:** `RUN` runs at build time and bakes its result into a layer. `CMD` and `ENTRYPOINT` run at container start time and define the default process. Build vs run is the dividing line.

**Q:** Why do my builds keep redownloading dependencies?

**A:** Probably because you run `COPY . .` before installing deps.
Any change to any file invalidates the cache for that line and everything after, including the install step. Move the dep install up - copy lock files first, install, then copy the rest.

**Q:** What is BuildKit and do I need it?

**A:** BuildKit is the modern build engine for Docker (default since Docker 23). It enables parallel stage builds, cache mounts, secret mounts, and the `# syntax=docker/dockerfile:1.x` directive, which selects the Dockerfile frontend version and with it newer syntax. You almost always already have it. Run `docker buildx version` to confirm.

**Q:** When should I use `ARG` vs `ENV`?

**A:** `ARG` for build-time-only values (e.g., `--build-arg VERSION=1.2.3` to tag the build). `ENV` for runtime values that should be visible inside the running container (e.g., `ENV NODE_ENV=production`). `ARG` values are not set in the running container, although they can still show up in `docker history`; `ENV` values persist.

**Q:** (Senior) How do you handle secrets at build time without leaking them into a layer?

**A:** Use BuildKit secret mounts: `RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci`. The secret is mounted for that `RUN` step only and is never written to a layer. Do not copy it elsewhere in the filesystem (for example with `cp /run/secrets/npmrc ~/.npmrc`), because the copy would be saved in the layer. Pass it with `docker buildx build --secret id=npmrc,src=$HOME/.npmrc .`. Build args (`ARG`) leak into image history and should never carry secrets.

## Examples

### Multi-stage build for a Go service

```dockerfile
# Stage 1: build
FROM golang:1.23-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /out/server ./cmd/server

# Stage 2: runtime - nothing but the binary
FROM scratch
COPY --from=build /out/server /server
EXPOSE 8080
USER 65532:65532
ENTRYPOINT ["/server"]
```

Final image is roughly the size of the Go binary. No shell, no libc, no package manager. The only thing an attacker can interact with is your service.
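One practical caveat with `FROM scratch` is that the image also has no CA certificates or timezone data, so a service that makes outbound TLS calls would need those copied in. A possible variant, sketched here under the assumption that the service needs them, swaps the final stage for the distroless static base. The image tag below is an assumption; check the distroless project for the tags it currently publishes.

```dockerfile
# Stage 1: build (same as the scratch example)
FROM golang:1.23-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /out/server ./cmd/server

# Stage 2: distroless static base instead of scratch
# (assumed tag; the nonroot variant runs as an unprivileged user by default)
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /out/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
```

The trade-off is a slightly larger image in exchange for certificates, timezone data, and a ready-made non-root user, while still shipping no shell or package manager.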
### Python app with cache mount (BuildKit)

```dockerfile
# syntax=docker/dockerfile:1.7
FROM python:3.13-slim
WORKDIR /app
COPY requirements.txt ./
# pip's cache lives in the cache mount, not in the image layer
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]
```

The cache mount keeps pip's wheels cached **across builds** without baking them into the image. Do not combine it with `--no-cache-dir`, which disables pip's cache and defeats the mount. When `requirements.txt` changes and the layer is rebuilt, pip reuses previously downloaded wheels from the mount instead of downloading everything again; the layer itself stays clean.
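### Node.js app with a build secret (BuildKit)

The follow-up question on build-time secrets can be illustrated the same way. This is a minimal sketch, assuming a private npm registry whose credentials live in `$HOME/.npmrc` on the build machine; the secret id `npmrc` and the tag `myapp:1.0` are illustrative names. The secret is mounted directly at the path npm reads, so nothing has to be copied into the image filesystem.

```dockerfile
# syntax=docker/dockerfile:1.7
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
# The secret file exists only while this RUN step executes; it is not stored in a layer.
# RUN executes as root here, so npm looks for its config at /root/.npmrc.
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm ci --omit=dev
COPY . .
USER node
CMD ["node", "server.js"]
# Build with:
#   docker buildx build --secret id=npmrc,src=$HOME/.npmrc -t myapp:1.0 .
```

Keeping `USER node` after the install step means the mounted file is read by root during the build, while the running container still drops privileges.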