What is a Docker image?

A Docker image is an immutable, read-only template that packages an application together with everything it needs to run: code, runtime, system libraries, environment variables, and default configuration.

Theory

TL;DR

An image is a blueprint, not a running thing. Containers are running instances of an image.
Made of three parts: layers (the filesystem), a manifest (what layers and config to use), and a config blob (env, command, working dir, etc.).
Identified by a tag like nginx:1.27-alpine (mutable, can be moved) or a digest like sha256:4f06b3e2... (immutable, content-addressed).
You either pull one from a registry (docker pull) or build your own from a Dockerfile (docker build).
Standardized by the OCI Image Spec, so Docker, Podman, containerd, and Kubernetes all read the same format.

Quick example

bash

# Pull a specific version of nginx
$ docker pull nginx:1.27-alpine
1.27-alpine: Pulling from library/nginx
9824c27679d3: Pull complete
8e1015e74a85: Pull complete
Digest: sha256:4f06b3e2c0c1e8e6a9d2c8f3e8d7a6b5c4...
Status: Downloaded newer image for nginx:1.27-alpine

# See it in your local cache
$ docker images
REPOSITORY   TAG           IMAGE ID       SIZE
nginx        1.27-alpine   4f06b3e2c0c1   54.9MB

The image is now sitting on your disk. It does nothing yet. To use it, you start a container from it: docker run nginx:1.27-alpine.

What is actually inside an image

Three pieces, all addressed by their content hash:

Layers - the filesystem contents, split into ordered tarballs. Each Dockerfile instruction that changes the filesystem (RUN, COPY, ADD) produces one layer; instructions such as ENV or CMD only update the config. Layers are deduplicated across images: if node:22-alpine and python:3.13-alpine both use the same Alpine base, that base layer lives on your disk only once.
Config blob - JSON describing how to run a container from this image: working directory, default command, environment variables, exposed ports, user, entrypoint.
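Manifest - JSON that ties the above together: the list of layer digests and the config digest. The target platform (linux/amd64, linux/arm64) is recorded in the config and, for multi-platform images, in the index entry that points at each manifest. The manifest itself has a digest, and that digest is your image's true identity.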

When you pull, the daemon fetches the manifest first, then any layers and the config that you do not already have locally.
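The pull sequence and layer deduplication can be sketched in a few lines of Python. This is a toy model, not the registry protocol: the byte strings stand in for real layer tarballs and config JSON, the helpers digest, make_manifest, and blobs_to_fetch are hypothetical names, and real manifests carry more fields (media types, sizes).

```python
import hashlib
import json


def digest(data: bytes) -> str:
    """Content address of a blob: 'sha256:' plus the hex SHA-256 of its bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def make_manifest(config: bytes, layers: list[bytes]) -> dict:
    """Toy manifest: only the digests of the config blob and the ordered layers."""
    return {"config": digest(config), "layers": [digest(layer) for layer in layers]}


def blobs_to_fetch(manifest: dict, local_store: set[str]) -> list[str]:
    """Blobs named by the manifest that are not in the local store yet."""
    wanted = [manifest["config"], *manifest["layers"]]
    return [d for d in wanted if d not in local_store]


# Stand-in bytes for real layer tarballs and config JSON (assumed values).
alpine_base = b"alpine base layer"
node_layer = b"node runtime layer"
python_layer = b"python runtime layer"
node_config = json.dumps({"Cmd": ["node"]}).encode()
python_config = json.dumps({"Cmd": ["python3"]}).encode()

node_image = make_manifest(node_config, [alpine_base, node_layer])
python_image = make_manifest(python_config, [alpine_base, python_layer])

local_store: set[str] = set()
first_pull = blobs_to_fetch(node_image, local_store)
local_store.update(first_pull)
second_pull = blobs_to_fetch(python_image, local_store)

print(len(first_pull))   # 3: config + 2 layers
print(len(second_pull))  # 2: config + 1 layer; the shared base is already local
```

Because every blob is named by the hash of its bytes, "do I already have this?" is a simple set lookup, and that is what makes sharing a base layer between images cheap.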

Tag vs digest

This trips up almost everyone the first time.

Tag is a name pointing at a manifest. It is mutable. Today nginx:latest points to manifest A; tomorrow the image maintainers push a new latest, and nginx:latest points to manifest B. Same name, different image.
Digest is a SHA256 hash of the manifest's content. It is immutable by definition. nginx@sha256:4f06b3e2... always means exactly the same bytes, forever.

bash

# Reproducible: pull by digest
$ docker pull nginx@sha256:4f06b3e2c0c1e8e6...
# Unreproducible: tag may have changed since you tested
$ docker pull nginx:latest

For production, pin to a digest or at least a specific version like nginx:1.27.4. :latest will surprise you eventually.
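The difference is easier to see in a toy model. The ToyRegistry class below is a hypothetical in-memory stand-in, not how any real registry is implemented: tags are entries in a mutable lookup table, while digests are computed from the manifest bytes.

```python
import hashlib


def digest(manifest: bytes) -> str:
    """Content address: 'sha256:' plus the hex SHA-256 of the manifest bytes."""
    return "sha256:" + hashlib.sha256(manifest).hexdigest()


class ToyRegistry:
    """Hypothetical in-memory registry: tags are pointers, digests are content addresses."""

    def __init__(self) -> None:
        self.manifests: dict[str, bytes] = {}  # digest -> manifest bytes
        self.tags: dict[str, str] = {}         # tag -> digest

    def push(self, tag: str, manifest: bytes) -> str:
        d = digest(manifest)
        self.manifests[d] = manifest
        self.tags[tag] = d  # silently moves the tag if it already existed
        return d

    def pull(self, ref: str) -> bytes:
        # A reference starting with "sha256:" is a digest; anything else is a tag.
        d = ref if ref.startswith("sha256:") else self.tags[ref]
        return self.manifests[d]


registry = ToyRegistry()
digest_a = registry.push("latest", b"manifest A")
before = registry.pull("latest")
registry.push("latest", b"manifest B")  # a new release is published under the same tag

print(registry.pull("latest") == before)  # False: the tag moved
print(registry.pull(digest_a) == before)  # True: the digest still names manifest A
```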

Build vs pull

Two paths to get an image into your local cache:

Pull - download a pre-built image from a registry (Docker Hub, ECR, GHCR, your own).

bash

$ docker pull postgres:16-alpine

Build - construct your own from a Dockerfile, layer by layer.

bash

$ docker build -t myapp:0.1 .

In a real workflow, your CI builds an image from your Dockerfile, tags it, and pushes it to a registry. Production hosts then pull it.

Image naming

The full form is [REGISTRY[:PORT]/][NAMESPACE/]REPOSITORY[:TAG|@DIGEST]. A few examples:

nginx:1.27-alpine - shorthand. Defaults to Docker Hub registry, library namespace.
myorg/myapp:v2.3 - a user or organization repository on Docker Hub (public or private).
ghcr.io/myorg/myapp:v2.3 - GitHub Container Registry.
123456789012.dkr.ecr.eu-west-1.amazonaws.com/myapp:v2.3 - AWS ECR (the first component is the 12-digit AWS account ID).

If you do not specify a tag, Docker assumes :latest. That is a convenience that bites in production.
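The defaulting rules can be written down as a small parser. This is a simplified sketch, not Docker's actual reference parser: parse_image_ref and ImageRef are hypothetical names, there is no validation, and the "looks like a host name" test (contains a dot or a port, or is localhost) is an assumption that matches common usage.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: Optional[str]
    digest: Optional[str]


def parse_image_ref(ref: str) -> ImageRef:
    """Expand a short image name using Docker Hub defaults (simplified, no validation)."""
    digest = None
    if "@" in ref:
        ref, digest = ref.split("@", 1)

    # A colon after the last slash separates the tag; an earlier colon is a registry port.
    tag = None
    last_slash = ref.rfind("/")
    last_colon = ref.rfind(":")
    if last_colon > last_slash:
        ref, tag = ref[:last_colon], ref[last_colon + 1:]

    # The first path component is a registry only if it looks like a host name.
    first, _, rest = ref.partition("/")
    if rest and ("." in first or ":" in first or first == "localhost"):
        registry, repository = first, rest
    else:
        registry, repository = "docker.io", ref
        if "/" not in repository:
            repository = "library/" + repository  # namespace of official images

    if tag is None and digest is None:
        tag = "latest"  # the implicit default described above
    return ImageRef(registry, repository, tag, digest)


for name in ["nginx", "nginx:1.27-alpine", "myorg/myapp:v2.3", "ghcr.io/myorg/myapp:v2.3"]:
    print(name, "->", parse_image_ref(name))
```

Note that a bare nginx silently becomes docker.io/library/nginx:latest, which is exactly the implicit default that causes trouble in production.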

Common mistakes

Confusing image with container

bash

$ docker images       # list IMAGES (templates)
$ docker ps -a        # list CONTAINERS (instances)
$ docker rmi <id>     # remove an image
$ docker rm  <id>     # remove a container

If docker rmi nginx complains "image is being used by stopped container", a container created from that image still exists, even though it is not running. Remove the container first, then the image.

Trusting :latest in production

yaml

# WRONG: this can change overnight
image: nginx:latest

# RIGHT: pin to specific version
image: nginx:1.27.4
# BETTER: pin to digest
image: nginx@sha256:4f06b3e2c0c1...

A new :latest can introduce a breaking change between two deploys of the same code. The digest never lies.
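A check like this is easy to automate in CI. The sketch below is one possible approach under simple assumptions: pin_level is a hypothetical helper, it only looks at the shape of the reference string, and it does not contact a registry or verify that the digest exists.

```python
def pin_level(ref: str) -> str:
    """Classify an image reference as 'digest', 'tag', or 'floating'."""
    if "@sha256:" in ref:
        return "digest"
    name = ref.rsplit("/", 1)[-1]  # drop everything before the last slash, including any registry port
    tag = name.partition(":")[2]   # text after the first colon, or "" if there is none
    if tag in ("", "latest"):
        return "floating"
    return "tag"


for ref in ["nginx:latest", "nginx", "nginx:1.27.4", "nginx@sha256:4f06b3e2c0c1..."]:
    print(ref, "->", pin_level(ref))
```

A pipeline could fail the build on "floating" and warn on "tag", leaving "digest" as the only level that passes silently.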

Sending the entire repo as build context

dockerfile

# Dockerfile - without a .dockerignore, this copies the entire build context into the image:
COPY . /app

# .dockerignore - keep build context small
node_modules
.git
*.log
dist/

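Docker sends the build context (the directory you pass to docker build, usually the current one) to the builder before building. The legacy builder uploads the whole directory; BuildKit transfers only the files the build needs, but COPY . needs all of them. Without a .dockerignore, you ship node_modules, .git, and gigabytes of dev artifacts to the builder, and into the image, on every build.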

Mutating an image after build

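You cannot. If you exec into a container and apt-get install something, the image is unchanged - those packages live in the writable layer of that one container. They survive a stop and start of that container, but they are gone as soon as the container is removed, and no other container started from the image ever sees them. To bake them in, edit the Dockerfile and rebuild.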
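The relationship between read-only image layers and a container's writable layer can be modeled with Python's ChainMap. This is an analogy only, under the assumption that a filesystem is a mapping from paths to contents; the paths, contents, and the start_container helper are made up for illustration, and real overlay filesystems also handle deletions and copy-up.

```python
from collections import ChainMap
from types import MappingProxyType

# Read-only image layers, lowest first; keys are paths, values are file contents.
base_layer = MappingProxyType({"/bin/sh": "busybox", "/etc/os-release": "alpine"})
app_layer = MappingProxyType({"/app/server": "v1"})
image_layers = [base_layer, app_layer]


def start_container(layers):
    """New container: an empty writable layer stacked on top of the image layers."""
    writable = {}
    # ChainMap searches its maps left to right, so the newest layer goes first.
    return ChainMap(writable, *reversed(layers))


first = start_container(image_layers)
first["/usr/bin/curl"] = "installed by hand"  # lands only in first.maps[0]

second = start_container(image_layers)  # a fresh container from the same image

print("/usr/bin/curl" in first)   # True
print("/usr/bin/curl" in second)  # False
print(first["/app/server"])       # v1, read through from the image layers
```

Writes always go to the top map, so the image layers stay untouched no matter what a container does.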

Real-world usage

Docker Hub - public registry hosting nginx, postgres, redis, node, and millions of community images. Default registry when you docker pull without specifying one.
AWS ECR / Google Artifact Registry / GitHub Container Registry - private registries used by most teams shipping production code. Same image format, different access controls.
Multi-architecture images - one tag like nginx:1.27 actually points to a manifest list (an "image index") containing manifests for linux/amd64, linux/arm64, linux/arm/v7. Your client picks the right one for your CPU automatically.
Cosign / Sigstore - cryptographic signing of image digests. Used in supply-chain-aware setups so production only deploys images signed by a trusted CI.

Follow-up questions

Q: Where are images actually stored on my machine?

A: In Docker's storage area, typically under /var/lib/docker/ on Linux. With the default overlay2 storage driver, layers are unpacked into /var/lib/docker/overlay2/, one directory per layer. The path is private to the daemon - you do not interact with it directly, you use docker images and docker rmi.

Q: Are Docker images and OCI images the same thing?

A: Effectively yes. The OCI Image Specification was extracted from Docker's format and is now the open standard. Docker, Podman, containerd, and Kubernetes all read OCI images. "Docker image" and "OCI image" are used interchangeably in practice.

Q: Why do two images with the same content sometimes have different digests?

A: Because metadata in the config blob (creation timestamp, build history) changes the bytes even when filesystem layers are identical. File modification times inside the layer tarballs can also differ between builds, which changes the layer digests themselves. To get reproducible digests, build with --build-arg SOURCE_DATE_EPOCH=... and other reproducibility flags, and avoid embedding timestamps.
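A minimal sketch of the effect, assuming a toy config blob with only a created field and a list of layer digests (real configs hold much more, and build_config is a hypothetical helper, not part of any tool):

```python
import hashlib
import json


def digest(data: bytes) -> str:
    """Content address: 'sha256:' plus the hex SHA-256 of the bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def build_config(layer_digests: list[str], created: str) -> bytes:
    """Toy config blob: a creation timestamp plus the layer digests it refers to."""
    config = {"created": created, "rootfs": {"diff_ids": layer_digests}}
    # sort_keys and fixed separators keep the serialization itself deterministic.
    return json.dumps(config, sort_keys=True, separators=(",", ":")).encode()


layers = [digest(b"identical layer bytes")]

# Example timestamps, chosen arbitrarily for illustration.
monday = digest(build_config(layers, "2025-01-06T09:00:00Z"))
tuesday = digest(build_config(layers, "2025-01-07T09:00:00Z"))
pinned_a = digest(build_config(layers, "1970-01-01T00:00:00Z"))
pinned_b = digest(build_config(layers, "1970-01-01T00:00:00Z"))

print(monday == tuesday)     # False: only the timestamp differs
print(pinned_a == pinned_b)  # True: a fixed timestamp gives a stable digest
```

Since the manifest contains the config digest, a changed timestamp propagates upward and changes the image digest as well.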

Q: What is the difference between an image and a manifest list?

A: An image manifest describes one image for one platform (say linux/amd64). A manifest list (OCI calls it an "image index") points to multiple manifests for different platforms. When you docker pull nginx:1.27, the daemon fetches the manifest list, picks the manifest for your CPU, then pulls those layers. Same tag, different actual bytes per platform.
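Platform selection can be sketched as a lookup over the index entries. The index below is a hand-written toy with placeholder digests, not real nginx data, and select_manifest is a hypothetical helper; real clients apply more elaborate matching and fallback rules.

```python
from typing import Optional

# Toy image index; the digests are placeholders, not real nginx digests.
index = {
    "manifests": [
        {"digest": "sha256:aaa", "platform": {"os": "linux", "architecture": "amd64"}},
        {"digest": "sha256:bbb", "platform": {"os": "linux", "architecture": "arm64"}},
        {"digest": "sha256:ccc", "platform": {"os": "linux", "architecture": "arm", "variant": "v7"}},
    ]
}


def select_manifest(
    index: dict, os: str, architecture: str, variant: Optional[str] = None
) -> Optional[str]:
    """Return the digest of the first manifest matching the requested platform."""
    for entry in index["manifests"]:
        platform = entry["platform"]
        if platform["os"] != os or platform["architecture"] != architecture:
            continue
        if variant is not None and platform.get("variant") != variant:
            continue
        return entry["digest"]
    return None  # no manifest published for this platform


print(select_manifest(index, "linux", "arm64"))      # sha256:bbb
print(select_manifest(index, "linux", "arm", "v7"))  # sha256:ccc
print(select_manifest(index, "windows", "amd64"))    # None
```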

Q: (Senior) How do you guarantee that the image you tested in CI is the exact image deployed to production?

A: Pin to digest, not tag. Your CI captures the digest after build (docker buildx imagetools inspect or the output of docker push), and your deployment manifest references image@sha256:.... That bypasses the tag mutability problem entirely. For supply-chain assurance, sign the digest with Cosign and verify the signature at admission time. Tag-based deploys are convenient but they cannot give you that guarantee.

Examples

Inspecting an image

bash

$ docker inspect nginx:1.27-alpine | head -30
[
    {
        "Id": "sha256:4f06b3e2c0c1...",
        "RepoTags": ["nginx:1.27-alpine"],
        "RepoDigests": ["nginx@sha256:4f06b3e2c0c1..."],
        "Architecture": "amd64",
        "Os": "linux",
        "Size": 54923456,
        "Config": {
            "Cmd": ["nginx", "-g", "daemon off;"],
            "ExposedPorts": { "80/tcp": {} },
            "WorkingDir": "",
            "Env": ["PATH=/usr/local/sbin:/usr/local/bin..."]
        }
    }
]

This dumps the image's config and the metadata Docker keeps about it locally; it does not print the manifest itself. The Config section is what defines container defaults: when you docker run nginx:1.27-alpine with no overrides, you get exactly these.

Building a tiny image

dockerfile

# Dockerfile
FROM alpine:3.21
RUN apk add --no-cache curl
CMD ["curl", "--version"]

bash

$ docker build -t curl-tool:0.1 .
[+] Building 4.2s (6/6) FINISHED
 => [1/2] FROM alpine:3.21
 => [2/2] RUN apk add --no-cache curl
 => exporting to image

$ docker run --rm curl-tool:0.1
curl 8.10.1 (x86_64-alpine-linux-musl)

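Two filesystem layers: the Alpine base and the apk add layer. CMD adds no layer; it only changes the image config. Total size: on the order of 10 MB. Anyone who runs this image gets curl without installing it on the host, and no build tools are left behind.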
