What is Docker and why do we need it?
Docker is an open platform that packages an application and its dependencies into an isolated unit called a container, so the same software runs identically on a laptop, a CI runner, and a production server.
Theory
TL;DR
- Docker = packaging format + runtime: build once, run anywhere a Docker engine is installed
- A container is a process isolated from the host using Linux kernel features (namespaces and cgroups), not a tiny VM
- Solves the "works on my machine" class of bugs by shipping the runtime alongside the code
- Industry standard since Docker Engine 1.0 in 2014; current stable line is 29.x (April 2026)
- Reach for it when teams need environment parity, fast deploys, or service isolation. Skip it for static sites and one-shot scripts
Quick example
$ docker run -p 8080:80 nginx
Unable to find image 'nginx:latest' locally
latest: Pulling from library/nginx
e4fff0779e6d: Pull complete
2c5c0f9c49b1: Pull complete
Status: Downloaded newer image for nginx:latest
2026/04/30 10:24:01 [notice] 1#1: nginx/1.27.4

That one command pulled a pre-built nginx image, started it isolated from your host, and mapped port 80 inside the container to port 8080 on your laptop. No nginx install, no config files, no leftover state when you stop it.
Why teams actually adopted it
Before Docker, deploying a service meant a runbook: install Node 16, install libpng, copy these env vars, run this migration. Each environment drifted independently. Docker collapses all of that into a single artifact called an image that you build once and run everywhere identically. The line that gets repeated in every Docker talk, "works on my machine", stopped being a joke when people realized the image you tested locally is bit-for-bit the same image that runs in production.
When to reach for Docker
- Multiple services share a host but need different runtime versions (Python 3.11 service and Python 3.13 service on one box)
- CI must reproduce the production environment exactly
- A new developer should be able to clone the repo and bring up the full stack with one command (docker compose up)
- You ship to multiple cloud providers and don't want to be locked into AWS-specific deploy tooling
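To make the one-command case concrete, here is a sketch of the Compose file that command might sit on top of. The service names, image versions, and password are illustrative, not taken from any particular repo:

```yaml
# docker-compose.yml (illustrative sketch)
services:
  app:
    build: .                 # build the image from this repo's Dockerfile
    ports:
      - "3000:3000"
    depends_on: [db, cache]  # start the backing services first
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: devpass
    volumes:
      - pgdata:/var/lib/postgresql/data  # data survives container restarts
  cache:
    image: redis:7

volumes:
  pgdata:
```

With this file checked in, `docker compose up` is the whole onboarding runbook.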
When you don't need it
A static site that goes to a CDN gains nothing from a container. A short shell script you run once does not need to be wrapped. Embedded systems with strict memory limits will feel the 30 MB of container overhead. The break-even is usually two or three services with different runtimes, or a CI pipeline that needs reproducibility. Below that, the complexity costs more than the consistency saves.
How it actually runs
Docker is not a virtual machine. When you run docker run nginx, the Docker daemon (dockerd) asks the container runtime (containerd, then runc) to start an nginx process on the host kernel. That process gets its own view of the filesystem, network interfaces, and process IDs through Linux namespaces. CPU and memory limits come from cgroups. The nginx process thinks it is alone on the machine; the host kernel knows it is just another process with extra restrictions. That is also why a container starts in milliseconds while a VM takes tens of seconds.
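You can see these namespaces without Docker at all. On any Linux box, `/proc` exposes the namespace IDs a process belongs to; containers are simply processes that get fresh IDs instead of inheriting the host's:

```shell
# Every process, containerized or not, lives in a set of namespaces.
# Inspect the current shell's PID and network namespaces:
readlink /proc/$$/ns/pid   # e.g. pid:[4026531836]
readlink /proc/$$/ns/net   # e.g. net:[4026531840]
# Two shells on the same host print the same IDs. A process started by
# `docker run` prints different ones, because the runtime gave it new
# namespaces for PIDs, networking, mounts, and so on.
```

The exact numbers vary per machine; what matters is that a container's IDs differ from the host's while sharing the same kernel.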
Common mistakes
Treating a container as a long-lived server
# WRONG: shelling in to install packages
docker exec -it mycontainer apt-get install vim
# Those changes vanish when the container restarts.
# RIGHT: bake what you need into the Dockerfile
FROM nginx:1.27
RUN apt-get update && apt-get install -y vim

A container is meant to be replaceable. Anything you change inside a running one is lost on restart, unless it is in a volume.
Mixing image and container in conversation
$ docker images # lists images (templates)
$ docker ps # lists running containers (instances)
$ docker rm <id> # removes a container
$ docker rmi <id> # removes an image

Image is the blueprint. Container is the running thing. They have separate commands, and confusing them costs new users their first hour.
Running everything as root
# WRONG: default user is root, container escapes get scarier
FROM node:22
COPY . /app
CMD ["node", "app.js"]
# RIGHT: drop privileges
FROM node:22
COPY . /app
USER node
CMD ["node", "app.js"]

A container shares the host kernel. A root process inside is still a root process if it escapes the namespace.
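The node base image ships a ready-made node user, but many base images do not. A minimal sketch of creating one yourself, assuming a Debian-based image; the user name and UID are arbitrary choices, not a convention:

```dockerfile
FROM debian:bookworm-slim
# Create an unprivileged user; "appuser" and UID 10001 are illustrative
RUN useradd --create-home --uid 10001 appuser
USER appuser
WORKDIR /home/appuser
CMD ["id"]
```

Putting USER near the end matters: install steps above it can still run as root, while the process you actually ship cannot.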
Real-world usage
- Netflix: developers build and test microservices in Docker containers, then deploy to AWS. Image identity from laptop to production removed an entire class of deploy regression bugs.
- Spotify: ran a homegrown orchestrator (Helios) on Docker for years before moving to Kubernetes. Service creation that used to take an hour dropped to minutes after the migration.
- PayPal, Uber, Airbnb: Docker as the standard CI artifact. PR build produces an image, integration tests run against it, the image gets pushed to a registry and deployed.
- Local development: docker compose up brings up Postgres, Redis, and the app at once instead of brew install for five things and remembering versions.
Once I spent two days chasing a Node 16 vs Node 14 mismatch between dev and prod that disappeared the day we wrapped both in identical Docker images. That class of bug is what Docker is for.
Follow-up questions
Q: If Docker is not a VM, why does it feel like one?
A: Because the isolation is good enough that processes inside cannot see anything outside. But it is a process on your host kernel, not a guest operating system. That is why containers start in milliseconds while VMs take tens of seconds.
Q: What does Docker actually install on my machine?
A: The Docker Engine. That includes dockerd (background daemon), docker (CLI client), containerd (the actual container runtime), and a handful of CLI helpers. On Mac and Windows it bundles a small Linux VM, because containers need a Linux kernel.
Q: Is Docker the same as Kubernetes?
A: No. Docker builds and runs single containers on one host. Kubernetes orchestrates many containers across many hosts and handles things like restarts, rolling updates, and service discovery. Most production teams use both.
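To make the division of labor concrete, here is a hypothetical Kubernetes manifest running the same image Docker built; the name myapp and the replica count are illustrative:

```yaml
# Kubernetes runs the image Docker produced (sketch, names are illustrative)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3                  # Kubernetes keeps 3 copies running, restarting crashed ones
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:0.1     # the image from `docker build -t myapp:0.1 .`
          ports:
            - containerPort: 3000
```

Docker's job ends at producing and running myapp:0.1; Kubernetes's job is everything in the spec around it.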
Q: Why do some teams use Podman or nerdctl instead of docker?
A: Same OCI image format, same workflows, but no daemon (Podman) or a thinner stack (nerdctl runs directly on containerd). It matters in regulated environments where running a long-lived root daemon is itself a security concern.
Q: (Senior) When does Docker become the wrong tool?
A: When the overhead and complexity outweigh the consistency win. A static site behind a CDN gains nothing. A team with one Go binary and one server gains very little. The break-even is usually two or three services with different runtimes, or a CI pipeline that needs reproducibility. Below that, you are paying for capability you will not use.
Examples
Running a one-off database for local development
$ docker run -d --name pg \
-e POSTGRES_PASSWORD=devpass \
-p 5432:5432 \
postgres:16
# Connect from your app
$ psql -h localhost -U postgres

No Postgres install on the host, no service to manage. Stop and remove with docker rm -f pg when done. Want Postgres 17 next month? Change 16 to 17. No conflict, no leftover data unless you ask for it.
Packaging your own Node app
# Dockerfile
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
USER node
EXPOSE 3000
CMD ["node", "server.js"]

$ docker build -t myapp:0.1 .
$ docker run -p 3000:3000 myapp:0.1

Now any teammate runs git pull && docker build -t myapp:0.1 . && docker run -p 3000:3000 myapp:0.1 and gets the same Node version, the same dependencies, the same runtime as production. The Dockerfile is the build recipe; the image is the result; the container is the running instance.