# How to set up a health check for a Docker container?

## Short answer

**A health check** is a periodic command Docker runs inside the container; if it returns 0, the container is `healthy`. Set it in the Dockerfile (`HEALTHCHECK`), at run time (`--health-cmd`), or in Compose (`healthcheck:`).

```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1
```

```yaml
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
  interval: 30s
  timeout: 3s
  retries: 3
  start_period: 10s
```

**Key:** the state shows up in `docker ps` (`healthy`, `unhealthy`, `starting`). Compose `depends_on: condition: service_healthy` waits for it. Orchestrators use it to decide when to route traffic and when to restart.

## Answer

**Container health checks** are how Docker (and Compose, Swarm, K8s) tell the difference between "the process is up" and "the app is actually working". Without a healthcheck, the only signal is "is PID 1 alive?", which misses every interesting failure mode.

## Theory

### TL;DR

- A healthcheck is a command Docker runs inside the container periodically. Exit 0 = healthy; non-zero = unhealthy.
- Three states: **starting** (still in `start_period`), **healthy**, **unhealthy** (failed `retries` times in a row).
- Set via `HEALTHCHECK` in the Dockerfile, `--health-cmd` on `docker run`, or `healthcheck:` in Compose.
- Used by `docker ps`, by Compose `depends_on: service_healthy`, and by Swarm to decide replica replacement.
- Common command: `curl -f http://localhost:<port>/health`. The picky detail: the command must exist inside the container.

### Quick example

```dockerfile
FROM node:22-alpine
WORKDIR /app
COPY . .
RUN npm ci --omit=dev
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD wget --quiet --tries=1 --spider http://localhost:3000/health || exit 1
CMD ["node", "server.js"]
```

```bash
$ docker run -d --name api myapp
$ docker ps
CONTAINER ID   IMAGE   STATUS                    NAMES
a3f9d2b8c1e4   myapp   Up 30 seconds (healthy)   api
# STATUS now includes (healthy) / (unhealthy) / (starting)
```

After ~30 seconds, the first check runs. If it succeeds, the status flips to `(healthy)`.

### The four flags that matter

```
--interval=DURATION      # how often to run the check (default 30s)
--timeout=DURATION       # max time the check has to return (default 30s)
--retries=N              # how many failures before unhealthy (default 3)
--start-period=DURATION  # grace period at startup; failures here do not count (default 0s)
```

For a typical web service, `--interval=30s --timeout=3s --retries=3 --start-period=10s` works well. Apps that take 30+ seconds to warm up (JVMs, big Python ML services) need a longer `--start-period` (60-120s).

### Compose syntax

```yaml
services:
  api:
    image: myapp
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 10s
  web:
    image: nginx
    depends_on:
      api:
        condition: service_healthy  # wait until api is healthy before starting
```

`depends_on: condition: service_healthy` is the killer feature — it waits for the dependency's healthcheck to pass before starting dependents. Far more robust than the simple list form.

### Three forms of the test command

```yaml
# Form 1: CMD (preferred — no shell)
test: ["CMD", "curl", "-f", "http://localhost:3000/health"]

# Form 2: CMD-SHELL (with a shell; lets you use &&, ||, and env var expansion)
test: ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"]

# Form 3: NONE (disable an inherited healthcheck from the base image)
test: ["NONE"]
```

The `CMD` form is faster (no shell process). `CMD-SHELL` is needed for env var expansion or shell logic.
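### Setting the healthcheck at run time

The same four knobs are also exposed as `docker run` flags, which is handy when you cannot rebuild the image. A minimal sketch — the image name `myapp` and the port are illustrative, and it assumes `curl` exists inside the image:

```shell
# Define (or override) the healthcheck entirely at run time.
# Run-time --health-* flags take precedence over a HEALTHCHECK baked into
# the image; --no-healthcheck disables any inherited check altogether.
docker run -d --name api \
  --health-cmd "curl -f http://localhost:8080/health || exit 1" \
  --health-interval 30s \
  --health-timeout 3s \
  --health-retries 3 \
  --health-start-period 10s \
  myapp
```

This requires a running Docker daemon, so treat it as a CLI recipe rather than something to paste blindly; the equivalent Compose keys remain the more maintainable option for anything long-lived.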
### Common failure: command not found in the container

The most common health-check bug:

```dockerfile
HEALTHCHECK CMD curl -f http://localhost:8080/health || exit 1
```

but the image is `alpine` without `curl` installed, so the healthcheck always fails. Solutions:

- `RUN apk add --no-cache curl` (on Alpine, `wget` is often already there: `wget -q --spider URL`)
- For Node apps: `node -e "require('http').get('http://localhost:3000/health', r => process.exit(r.statusCode===200?0:1))"` (no extra package needed)
- Distroless images often need a binary built in: include a `/healthcheck` binary in your Dockerfile that exits 0/1

### Inspecting health

```bash
# Current status in ps
docker ps --format 'table {{.Names}}\t{{.Status}}'

# Full health details
docker inspect api --format '{{json .State.Health}}' | jq
# {
#   "Status": "healthy",
#   "FailingStreak": 0,
#   "Log": [
#     { "Start": "...", "End": "...", "ExitCode": 0, "Output": "..." },
#     ...
#   ]
# }

# Watch live
watch 'docker ps --format "table {{.Names}}\t{{.Status}}"'
```

The `Log` array keeps the last 5 healthcheck results — invaluable for debugging "why is this unhealthy?".

### Common mistakes

**No `start_period`, so healthchecks fail during slow startup**

```yaml
# WRONG: the app needs 60s to warm up; the first 3 failures make it unhealthy
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
  interval: 5s
  retries: 3
# After 15s the container is unhealthy and might get restarted

# RIGHT: start_period gives a grace window
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
  interval: 30s
  start_period: 60s  # failures during this window do not count
```

**Healthcheck that depends on a dependency**

```yaml
# WRONG: the api healthcheck pings the db; if the db is down briefly, the api goes unhealthy
test: ["CMD", "sh", "-c", "curl -f http://localhost:3000/health && pg_isready -h db"]
```

Mix dependencies into your *liveness* check and a transient db blip restarts your api.
Better: the healthcheck should only verify the container's own health; expose a separate `/readiness` endpoint if your app needs to gate traffic on dependency health.

**Hitting an external URL in the healthcheck**

```dockerfile
HEALTHCHECK CMD curl -f https://api.example.com/health || exit 1
```

Now your container's health depends on someone else's uptime. Don't.

**Disabling a healthcheck without realizing it**

```dockerfile
FROM postgres:16
# inherits postgres' healthcheck. If your app does not exit 0 there, you get unhealthy.
# Either set your own:
HEALTHCHECK CMD pg_isready -U postgres
# Or disable the inherited one:
HEALTHCHECK NONE
```

### Real-world usage

- **Compose with `depends_on: service_healthy`:** wait for the db to be ready before starting the api. The biggest practical use.
- **Swarm orchestration:** unhealthy replicas are killed and replaced. The healthcheck is the signal.
- **Reverse proxy integration:** Traefik and nginx-proxy can inspect Docker healthcheck state to route only to healthy containers.
- **Monitoring dashboards:** scrape `docker inspect` output for health status and alert on `unhealthy`.

### Follow-up questions

**Q:** What is the difference between a Docker healthcheck and Kubernetes liveness/readiness probes?

**A:** Same idea, different scope. K8s splits liveness ("am I alive?") from readiness ("am I ready for traffic?"), with separate behaviors: a failing liveness probe restarts the pod, a failing readiness probe removes it from the Service. Docker has just one combined healthcheck. K8s does not use Docker's healthcheck — it has its own probes.

**Q:** What signal does an unhealthy container get?

**A:** None — turning unhealthy does not restart the container by itself. `--restart=on-failure` does not help either: the container has not exited, so there is no exit code to act on. With Swarm (or Compose on top of an orchestrator), the orchestrator decides to replace the unhealthy task. With plain `docker run`, you (or your monitoring) act.

**Q:** Can I have multiple healthchecks?

**A:** Only one per container. Combine the logic inside one `CMD-SHELL` command if needed.
**Q:** How do I disable the inherited healthcheck from a base image?

**A:** `HEALTHCHECK NONE` in your Dockerfile, or `test: ["NONE"]` in Compose.

**Q:** (Senior) When should the healthcheck do more than `curl /health`?

**A:** Add a smarter `/health` endpoint inside the app that checks downstream readiness (DB connection pool not exhausted, queue not backed up beyond a threshold). Keep the healthcheck command itself simple — the intelligence belongs in the endpoint, not in the shell. For services with a long warmup, expose `/livez` (returns OK as soon as the process is up) and `/readyz` (true app readiness) separately, mirroring the K8s liveness/readiness split with two endpoints, and point the healthcheck at whichever one fits your restart policy.

## Examples

### Compose stack with healthcheck-gated startup

```yaml
services:
  api:
    build: .
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD-SHELL", "wget -q --spider http://localhost:3000/health || exit 1"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 15s
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: dev
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 5
      start_period: 5s
```

```bash
$ docker compose up -d
[+] Running 2/2
 ✔ Container db   Healthy   1.4s
 ✔ Container api  Healthy   12.3s
```

Compose waits for the db to be healthy before starting the api. No race conditions, no "connection refused" on the first run.

### Node app with a built-in healthcheck (no extra packages)

```dockerfile
FROM node:22-alpine
WORKDIR /app
COPY . .
RUN npm ci --omit=dev
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/health', r => process.exit(r.statusCode === 200 ? 0 : 1))"
CMD ["node", "server.js"]
```

No curl needed — this uses the Node runtime that is already there. Smaller image, no extra package.
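### Combining several checks in one script

One way to work with the "only one healthcheck per container" limitation from the follow-up questions: fold everything into a single script and point `HEALTHCHECK CMD` at it. A hedged sketch — the port, paths, and disk threshold are illustrative assumptions, not part of the original answer:

```shell
#!/bin/sh
# /healthcheck.sh - one script, several checks; the container counts as
# healthy only if every check passes.
# Dockerfile wiring:  HEALTHCHECK CMD /healthcheck.sh run

http_ok() {
  # the app answers on its own health endpoint (port is an illustrative default)
  wget -q --spider "http://localhost:${PORT:-3000}/health"
}

disk_ok() {
  # refuse to report healthy when the filesystem is nearly full
  used=$(df -P "${DATA_DIR:-/}" | awk 'NR==2 {gsub(/%/, ""); print $5}')
  [ "$used" -lt "${DISK_LIMIT:-95}" ]
}

# Run the checks only when invoked with "run", so the file can also be
# sourced to exercise each check individually.
if [ "${1:-}" = "run" ]; then
  http_ok && disk_ok
fi
```

Keep every check local to the container (no pinging the db from here), in line with the liveness-vs-readiness warning above.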
### Watching health logs

```bash
$ docker inspect api --format '{{range .State.Health.Log}}{{.End}}: exit={{.ExitCode}} out={{.Output}}\n{{end}}'
2026-04-30T10:00:00Z: exit=0 out=ok
2026-04-30T10:00:30Z: exit=0 out=ok
2026-04-30T10:01:00Z: exit=1 out=connection refused
2026-04-30T10:01:30Z: exit=1 out=connection refused
2026-04-30T10:02:00Z: exit=0 out=ok
```

The last 5 results with exit codes and stdout. This often answers "why is/was this unhealthy?" without further investigation.
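### Acting on health status from a script

To close the loop on the monitoring idea above: a small sketch of the decision logic a monitor might apply to the status string that `docker inspect <name> --format '{{.State.Health.Status}}'` prints. The mapping from status to action is an illustrative assumption, not Docker behavior:

```shell
# classify_health: map a Docker health status string to a monitoring action.
# Input is one of: healthy, unhealthy, starting (or anything else when the
# container has no healthcheck or is not running).
classify_health() {
  case "$1" in
    healthy)   echo "ok"      ;;  # nothing to do
    starting)  echo "wait"    ;;  # inside start_period, check again later
    unhealthy) echo "alert"   ;;  # page someone, or restart the container
    *)         echo "unknown" ;;  # no healthcheck configured, or not running
  esac
}

classify_health healthy    # -> ok
classify_health unhealthy  # -> alert
```

In a cron job or monitoring agent you would feed it live output, e.g. `classify_health "$(docker inspect api --format '{{.State.Health.Status}}')"`.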