# How to debug problems in a Docker container?

**Debugging Docker containers** works best as a flowchart: start with the most general signals (logs, exit code), narrow down to specifics (exec inside, entrypoint override), and reach for specialized tools only when nothing else works. Following this order saves hours.

## Theory

### TL;DR

The debugging flowchart, in order:

1. **`docker ps -a`**: what is the container's state?
2. **`docker logs --tail 200 <name>`**: what did it print?
3. **`docker inspect <name>`**: exit code, OOM flag, error message, mounts, networks.
4. **`docker exec -it <name> sh`**: if it is running, poke around live.
5. **`docker run -it --entrypoint sh <image>`**: if it crashes too fast to attach.
6. **Override the broken thing:** `--entrypoint`, `--user root`, `--cap-add SYS_PTRACE` for strace.
7. **Specialized tools:** `dive` for layers, `tcpdump` for network, `docker stats` for resources.
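
The first three steps of the flowchart can be bundled into one helper. This is a minimal sketch rather than a standard tool: the function name `triage` and the 200-line tail are assumptions chosen for illustration.

```bash
# Hypothetical helper: run flowchart steps 1-3 for one container.
triage() {
    local name="$1"
    echo "== state =="
    docker ps -a --filter "name=$name" --format '{{.Names}}: {{.Status}}'
    echo "== last 200 log lines =="
    docker logs --tail 200 "$name" 2>&1
    echo "== exit details =="
    docker inspect "$name" --format \
        'exit={{.State.ExitCode}} OOM={{.State.OOMKilled}} Err={{.State.Error}}'
}
```

Call it as `triage api`. If the output already explains the failure, the later steps can be skipped.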

### Step 1: state

```bash
$ docker ps -a --filter name=api
CONTAINER ID   IMAGE   STATUS                       NAMES
a3f9d2b8c1e4   myapp   Exited (1) 3 seconds ago     api
```

Three facts in one line: the container exited, the exit code was 1, and it happened 3 seconds ago. Many failures can already be classified at this stage.

### Step 2: logs

```bash
# Recent output
docker logs --tail 200 api

# Live tail (for ongoing issues)
docker logs -f --tail 50 api

# Time-bounded
docker logs --since 10m api

# With timestamps
docker logs -t --tail 50 api
```

**Important:** `docker logs` shows the stdout/stderr of the container's main process (PID 1) and of anything that inherits those streams. If your app logs to a file inside the container, `docker logs` will show nothing. The fix is to make the app log to stdout (the 12-factor norm).

To capture the full output in a file: `docker logs api > app.log 2>&1`.
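
For a container started without a TTY, `docker logs` replays the container's stderr on its own stderr, so a plain pipe into `grep` misses it. A small sketch of filtering both streams follows. The function name `only_errors` and the keyword pattern are assumptions, so adjust them to your app's log format.

```bash
# Hypothetical filter: keep only lines that look like errors.
# The keyword list is an assumption, not a Docker convention.
only_errors() {
    grep -iE 'error|fatal|panic|exception'
}

# Usage (2>&1 merges the container's stderr into the pipe):
#   docker logs --since 10m api 2>&1 | only_errors
```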

### Step 3: inspect

The full JSON of the container's state. With `--format` you extract specific fields:

```bash
# Status quartet — usually answers "what happened?"
docker inspect api --format \
  '{{.State.Status}} (exit={{.State.ExitCode}}) OOM={{.State.OOMKilled}} Err={{.State.Error}}'
# exited (137) OOM=true Err=     ← OOM killer
# exited (1) OOM=false Err=      ← app error code 1
# exited (139) OOM=false Err=    ← segfault
```

```bash
# What is mounted?
docker inspect api --format '{{range .Mounts}}{{.Type}}: {{.Source}} -> {{.Destination}}{{"\n"}}{{end}}'

# What network?
docker inspect api --format '{{range $k, $v := .NetworkSettings.Networks}}{{$k}}: {{$v.IPAddress}}{{"\n"}}{{end}}'

# Health check log (last 5 attempts)
docker inspect api --format '{{json .State.Health}}'
```
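
The exit codes in the status line follow a convention: values above 128 mean the process was killed by signal number `code - 128`. A small decoder makes that explicit. This is a sketch, and `explain_exit_code` is a hypothetical name.

```bash
# Decode a container exit code.
# Codes above 128 conventionally mean "killed by signal (code - 128)".
explain_exit_code() {
    local code="$1"
    case "$code" in
        0)   echo "0: clean exit" ;;
        125) echo "125: docker run itself failed" ;;
        126) echo "126: command found but not executable" ;;
        127) echo "127: command not found" ;;
        *)
            if [ "$code" -gt 128 ]; then
                echo "$code: killed by signal $((code - 128))"
            else
                echo "$code: application-defined error"
            fi
            ;;
    esac
}
```

For example, `explain_exit_code 137` reports signal 9 (SIGKILL) and `explain_exit_code 139` reports signal 11 (SIGSEGV). To feed it a real value, pass `"$(docker inspect api --format '{{.State.ExitCode}}')"`.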

### Step 4: exec inside (if running)

```bash
# Drop into a shell
docker exec -it api sh
# or bash if available
docker exec -it api bash

# Check files exist
docker exec api ls -la /app

# Check env
docker exec api env | grep DATABASE

# Check connectivity to a sibling (5432 is Postgres, not HTTP, so test the TCP port; needs nc in the image)
docker exec api nc -zv db 5432

# Check what the process is doing
docker exec api ps aux
```

### Step 5: entrypoint override (if crashing too fast)

If the container exits before you can `exec` in:

```bash
# Skip the actual app, drop into a shell
docker run -it --rm --entrypoint sh myimage

# Same but with the original env vars + volumes
docker run -it --rm \
    --entrypoint sh \
    -e DATABASE_URL=... \
    -v ./data:/data \
    myimage

# Or run the original command, then keep the container alive after it exits so you can exec in
docker run -it --rm --entrypoint /bin/sh myimage -c 'node server.js; echo "exit code: $?"; sleep 3600'
```

`--entrypoint` swaps out the image's entrypoint with whatever you provide. Combined with `-it`, you get an interactive prompt instead of the dying app.
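
Before overriding the entrypoint, it helps to know what the image would normally run. The sketch below prints the entrypoint and default command stored in the image config. `show_start_command` is a hypothetical helper name.

```bash
# Print the ENTRYPOINT and CMD baked into an image,
# so you know what --entrypoint sh is replacing.
show_start_command() {
    docker image inspect --format \
        'ENTRYPOINT={{json .Config.Entrypoint}} CMD={{json .Config.Cmd}}' "$1"
}
```

Run `show_start_command myimage`, then type the printed command by hand inside the `--entrypoint sh` shell to watch it fail interactively.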

### Step 6: tactical overrides

```bash
# Run as root for diagnosis (default user might lack permissions)
docker exec -it -u root api sh

# Trace the running app: join its PID namespace from an image that ships strace
docker run -it --rm --pid=container:api --cap-add=SYS_PTRACE <image-with-strace> strace -p 1

# Disable the image's healthcheck (rules out healthcheck-driven restarts by an orchestrator)
docker run --no-healthcheck ...

# Run with restart=no to keep failed state visible
docker run --restart=no ...

# Use a specific DNS server to rule out resolver issues
docker run --dns=8.8.8.8 ...
```

### Distroless-specific debugging

Distroless images have no shell. To debug:

```bash
# Distroless publishes :debug variants that include a busybox shell
docker run -it --entrypoint sh gcr.io/distroless/base:debug

# If sh is not found on PATH, call the busybox shell by its full path
docker run -it --entrypoint /busybox/sh gcr.io/distroless/base:debug
```

For production distroless images, build a separate `:debug` tag with the same content plus a busybox layer, and deploy it only when you need to debug.
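
When a shell-less container is already running, another option is to start a throwaway container that joins its namespaces. This is a sketch under assumptions: `debug_sidecar` is a hypothetical name, and `busybox` is just one possible tools image.

```bash
# Hypothetical helper: open a shell that shares the target container's
# PID and network namespaces. The default image (busybox) is an assumption;
# any image with the tools you need works.
debug_sidecar() {
    local target="$1"
    local image="${2:-busybox}"
    docker run -it --rm \
        --pid="container:$target" \
        --network="container:$target" \
        "$image" sh
}
```

Inside that shell, `ps` lists the target's processes and network tools see its interfaces. The target's filesystem may be reachable under `/proc/1/root`, although that can require a matching user or `--cap-add SYS_PTRACE`.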

### Common mistakes

**Looking inside the writable layer for log files**

```bash
$ docker exec web cat /var/log/myapp.log
# (often empty or stale; the app should log to stdout)
```

Apps that log to files inside the container are not visible to `docker logs`. Either reconfigure the app to use stdout, or mount a volume for logs and read from the host.

**Running an interactive shell without `-it`**

```bash
$ docker exec api sh
# (sometimes hangs or exits immediately)
```

You need `-i` to keep stdin open and `-t` to allocate a TTY. Always use `-it` for interactive shells.

**Forgetting that `docker stop` may have killed the app mid-cleanup**

If `docker stop` kills the app mid-cleanup, you will see exit code 143 (SIGTERM) or 137 (SIGKILL after the grace period), and the logs may be truncated. Increase the grace period with `--time` (`-t`) on stop, or trap SIGTERM in your app.
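
Trapping SIGTERM can also be done in a shell entrypoint that wraps the app. The following is a minimal sketch, assuming a POSIX shell; `run_with_graceful_stop` is a hypothetical name, not a Docker feature.

```bash
# Sketch: forward SIGTERM to the real app and wait for it to finish.
run_with_graceful_stop() {
    got_signal=0
    "$@" &                       # start the real app in the background
    child=$!
    trap 'got_signal=1; kill -TERM "$child" 2>/dev/null' TERM
    wait "$child"                # returns early if SIGTERM arrives
    status=$?
    if [ "$got_signal" -eq 1 ]; then
        wait "$child"            # now collect the app's real exit status
        status=$?
    fi
    trap - TERM
    return "$status"
}
```

An entrypoint script would end with something like `run_with_graceful_stop node server.js`. A simpler alternative, when no cleanup is needed in the wrapper, is `exec node server.js`, which makes the app PID 1 so it receives the signal directly.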

**Mistaking docker-proxy errors for app errors**

```
Error starting userland proxy: listen tcp 0.0.0.0:8080: bind: address already in use
```

That is from Docker, not your app — port 8080 is already used on the host. Check `lsof -i :8080` or pick a different host port.

### Specialized debugging tools

```bash
# Image layer analysis
dive myimage         # interactive layer-by-layer view
docker history --no-trunc myimage   # who added what

# Network debugging
docker exec api tcpdump -i any -nn -c 20
docker exec api ss -tnlp     # listening ports
docker exec api ip route

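# Process tracing (join the running container's PID namespace; the image must ship strace)
docker run -it --rm --pid=container:api --cap-add=SYS_PTRACE <image-with-strace> strace -p 1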

# Resource debugging
docker stats --no-stream
docker inspect --format '{{.HostConfig.Memory}}' <container>   # memory limit in bytes (0 = no limit)

# Live filesystem changes
docker diff <container>     # what changed in the writable layer
```

### Real-world usage

- **"My app exits immediately":** check exit code → check logs → start a shell with `--entrypoint sh` to verify the binary exists and is executable.
- **"It works locally but not in CI":** compare env vars, mount paths, network. `docker inspect` is the rosetta stone.
- **"Container is unhealthy but I do not know why":** `docker inspect --format '{{json .State.Health}}'` shows the last 5 healthcheck attempts with their output.
- **"Out of memory crashes":** check `docker inspect` for `OOMKilled: true`. Then ask whether the limit is sized correctly, whether it was changed recently, and whether the app leaks memory. Watch `docker stats` over time.
- **"Can't reach another container":** exec in, ping by service name, check `/etc/resolv.conf`, verify both containers are on the same network.
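
The "same network" check from the last bullet can be scripted. This is a rough sketch, and the helper names `container_networks` and `shared_networks` are assumptions for illustration.

```bash
# List the names of the networks a container is attached to, one per line.
container_networks() {
    docker inspect --format \
        '{{range $name, $cfg := .NetworkSettings.Networks}}{{println $name}}{{end}}' "$1"
}

# Hypothetical helper: print the networks that two containers have in common.
shared_networks() {
    local nets_b
    nets_b=$(container_networks "$2")
    container_networks "$1" | while IFS= read -r net; do
        [ -n "$net" ] || continue
        if printf '%s\n' "$nets_b" | grep -Fxq -- "$net"; then
            printf '%s\n' "$net"
        fi
    done
}
```

If `shared_networks api db` prints nothing, the two containers cannot resolve each other by name, and attaching them to a common user-defined network is the first thing to try.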

### Follow-up questions

**Q:** How do I get logs of a container that has been removed?

**A:** You cannot: `docker rm` deletes the container's log files. If you anticipate the need, configure a logging driver that stores logs outside the container's lifecycle (the default `json-file` logs are deleted with the container, whereas `syslog`, `journald`, or `fluentd` hand them to an external system). Or always run `docker logs <name> > backup.log 2>&1` before `rm`.

**Q:** What does exit code 125 mean?

**A:** The `docker run` command itself failed before the container started. It is usually a Docker-level problem (invalid flag, bad image, bad mount, port conflict). Read the error printed by the CLI and look at the daemon log (`journalctl -u docker`).

**Q:** What does exit code 126 vs 127 mean?

**A:** 126 = command found but not executable (permissions or wrong arch). 127 = command not found at all. Often a typo or missing binary in the image.

**Q:** How do I debug a Compose service?

**A:** All the same commands work via `docker compose`: `docker compose logs api`, `docker compose exec api sh`, `docker compose run --rm api sh` for a one-off. The Compose wrappers know your project context.

**Q:** (Senior) How do you debug a container that crashes on the kubelet but works locally?

**A:** Reproduce the kubelet's exact run config: same image digest, same env, same entrypoint, same UID, same network policy. Translating a Pod spec into `docker run` flags is mostly mechanical. Often the difference is that the cluster enforces a non-root UID (through `securityContext` or an admission policy) while your local run uses the image default. Or K8s injects sidecars that block traffic. Or K8s mounts secrets/configmaps that your local run does not have. Use `kubectl debug` for an interactive container in the same pod context.
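
As an illustration of that translation, the sketch below maps a few common `securityContext` settings to `docker run` flags. UID/GID 1000 and the chosen flag set are assumptions, so copy the real values from your pod spec. `run_like_pod` is a hypothetical name.

```bash
# Hypothetical helper: approximate a locked-down pod locally.
#   runAsUser: 1000                  -> --user 1000:1000
#   readOnlyRootFilesystem: true     -> --read-only
#   capabilities.drop: [ALL]         -> --cap-drop ALL
#   allowPrivilegeEscalation: false  -> --security-opt no-new-privileges
run_like_pod() {
    local image="$1"
    shift
    docker run --rm -it \
        --user 1000:1000 \
        --read-only \
        --cap-drop ALL \
        --security-opt no-new-privileges \
        "$image" "$@"
}
```

If the app fails under `run_like_pod myimage` but works with a plain `docker run`, the cause is likely one of these restrictions rather than the cluster itself.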

## Examples

### A complete debugging session

```bash
$ docker ps -a --filter name=api
STATUS                       NAMES
Exited (137) 5 seconds ago   api

$ docker logs --tail 50 api
... lots of output ...
ERROR: connect ECONNREFUSED 172.18.0.5:5432

$ docker inspect api --format '{{.State.OOMKilled}} {{.State.ExitCode}} {{.HostConfig.Memory}}'
false 137 536870912
# OOM=false, exit 137 → not the OOM killer; 137 = 128 + 9, so SIGKILL came from somewhere else

$ docker logs --tail 50 db
2026-04-30 ... database system is shut down
# db exited first; api lost its connection and was killed afterwards (137 is SIGKILL, not SIGTERM).

# Fix: add depends_on with healthcheck in compose, ensure db survives
```

Four commands triangulated the issue: api was killed because db went down first.

### Debugging an image that crashes immediately

```bash
$ docker run myimg
# (exits in 0.1 seconds)

$ docker run --rm myimg --version
# (also exits immediately, no output)

# Override entrypoint to investigate
$ docker run -it --rm --entrypoint sh myimg
/ # which myapp
/usr/local/bin/myapp
/ # ls -la /usr/local/bin/myapp
-rwxr-xr-x 1 root root 12M ...
/ # /usr/local/bin/myapp
Segmentation fault
# → suspect an architecture mismatch between image and host.

$ docker inspect --format '{{.Architecture}}' myimg
amd64
$ uname -m
aarch64
# Confirmed: image arch != host arch; the amd64 binary only runs here under emulation.
```

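Without `--entrypoint sh`, the segfault was invisible. The "Segmentation fault" message is printed by the parent shell, and with the app as the entrypoint there is no shell to report it. The process itself died before writing anything to stdout or stderr, so `docker logs` had nothing.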
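
The manual comparison of image and host architecture can be wrapped in a small check. This is a sketch: `normalize_arch` and `check_arch` are hypothetical names, and the mapping only covers the two architectures discussed here.

```bash
# Map `uname -m` output to the names Docker uses for image architectures.
normalize_arch() {
    case "$1" in
        x86_64|amd64)  echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *)             echo "$1" ;;   # pass anything else through unchanged
    esac
}

# Hypothetical check: report whether the image arch matches the host arch.
check_arch() {
    local image_arch host_arch
    image_arch=$(docker image inspect --format '{{.Architecture}}' "$1")
    host_arch=$(normalize_arch "$(uname -m)")
    if [ "$image_arch" = "$host_arch" ]; then
        echo "ok: $image_arch"
    else
        echo "mismatch: image=$image_arch host=$host_arch"
    fi
}
```

On a mismatch, pulling or building the image for the host platform (for example with `--platform linux/arm64`) is usually the cleaner fix than relying on emulation.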

### Healthcheck debugging

```bash
$ docker ps --format '{{.Names}}: {{.Status}}'
api: Up 5 minutes (unhealthy)

$ docker inspect api --format '{{range .State.Health.Log}}{{.End}}: exit={{.ExitCode}} out={{.Output}}{{"\n"}}{{end}}'
2026-04-30T10:00:00Z: exit=0 out=ok
2026-04-30T10:01:00Z: exit=0 out=ok
2026-04-30T10:02:00Z: exit=7 out=connect: connection refused
2026-04-30T10:03:00Z: exit=7 out=connect: connection refused
2026-04-30T10:04:00Z: exit=7 out=connect: connection refused
# → health endpoint stopped responding 3 minutes ago. App probably hung.

$ docker exec api ps aux
# Look for the main process; is it running but not responsive?
# If yes, app is stuck (deadlock, infinite loop). Capture a stack trace.
```

The healthcheck log is the most direct evidence available: the last five attempts with their exit codes and output.
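
To reproduce a failing check by hand, it helps to see the exact command Docker runs. The sketch below prints it. `show_healthcheck` is a hypothetical helper, and it assumes the container has a healthcheck configured (the template fails otherwise).

```bash
# Print the healthcheck command configured for a container.
# Assumes a healthcheck exists; without one the template lookup errors out.
show_healthcheck() {
    docker inspect --format '{{json .Config.Healthcheck.Test}}' "$1"
}
```

Run the printed command through `docker exec api ...` to see its full output and exit code outside the healthcheck schedule.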
