
How to fix 'no space left on device' on a Docker host?

"No space left on device" is the most common Docker pain in production. Images accumulate, build cache piles up, container logs grow without bound, and volumes outlive the containers that created them. The fix is a mix of immediate cleanup and long-term hygiene.

Theory

TL;DR

  • Diagnose with docker system df before pruning. Know where space is going.
  • docker system prune -af --volumes reclaims everything not currently used. Safe in dev, careful in prod (drops detached volumes).
  • Logs are the silent killer: a chatty container can fill /var/lib/docker/containers/<id>/<id>-json.log to gigabytes.
  • Build cache can grow to tens of GB on busy CI hosts. Prune regularly.
  • Long-term: put /var/lib/docker on its own partition; enable log rotation in daemon.json; cron a prune.

Where space goes

Docker stores everything under /var/lib/docker (or wherever data-root points):

/var/lib/docker/
├── overlay2/          # image layers + container writable layers
├── containers/<id>/   # container metadata + logs (json-file)
├── volumes/           # named volumes
├── image/             # image manifest metadata
├── buildkit/          # build cache (with BuildKit)
└── tmp/               # transient

On a busy host, breakdown is typically:

  • 30-50% — image layers
  • 20-40% — anonymous/orphan volumes
  • 10-20% — container logs
  • 5-20% — build cache

docker system df shows this in human-readable form.

Categories of waste

Category, what it is, and how to clean it:

  • Stopped containers: exited containers retained for docker logs / docker start. Clean with docker container prune -f.
  • Dangling images: images with no tag (replaced by a newer build). Clean with docker image prune -f.
  • Unused images: images not referenced by any container. Clean with docker image prune -af (note the -a).
  • Anonymous volumes: volumes auto-created by the VOLUME Dockerfile directive, never cleaned. Clean with docker volume prune -f (named volumes too with -a).
  • Build cache: BuildKit cache layers from past builds. Clean with docker builder prune -af.
  • Container logs: json-file logs that grew unbounded. Clean via log rotation config.
  • Networks: unused custom networks (small). Clean with docker network prune -f.

Examples

Diagnostic flow

bash
# Step 1: how big is /var/lib/docker?
sudo du -sh /var/lib/docker
# 87G

# Step 2: breakdown by Docker category
docker system df
# TYPE          TOTAL   ACTIVE  SIZE     RECLAIMABLE
# Images        67      12      31.4GB   24.1GB (76%)
# Containers    34      8       3.2GB    2.7GB
# Volumes       28      5       45.0GB   38.0GB (84%)
# Build Cache   2104            8.0GB    8.0GB

# Step 3: drill into the worst offender
docker system df -v   # verbose: per-image, per-container, per-volume

The RECLAIMABLE column is your first target.
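To turn that table into a single "where do I start" answer, you can rank the categories by their RECLAIMABLE column. The sketch below runs against a captured sample (the numbers mirror the example output above) so it is self-contained; on a real host, pipe `docker system df` straight into the awk program instead.

```shell
# Sample stands in for live `docker system df` output.
sample='TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          67        12        31.4GB    24.1GB
Containers      34        8         3.2GB     2.7GB
Volumes         28        5         45.0GB    38.0GB
Build Cache     2104      0         8.0GB     8.0GB'

# NR > 1 skips the header; $NF is the RECLAIMABLE column; the inner loop
# re-joins multi-word type names like "Build Cache" (4 numeric columns
# always trail the name).
worst=$(printf '%s\n' "$sample" | awk 'NR > 1 {
    size = $NF + 0                     # "38.0GB" -> 38 (numeric prefix)
    if (size > max) {
        max = size
        type = $1
        for (i = 2; i <= NF - 4; i++) type = type " " $i
    }
} END { print type ": " max "GB" }')
echo "Biggest reclaim target: $worst"
# -> Biggest reclaim target: Volumes: 38GB
```

In this sample, volumes win by a wide margin, which matches the typical breakdown listed earlier.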

Quick win: prune everything safely

bash
# Stopped containers, dangling images, unused networks, build cache
docker system prune -f
# Reclaimed: 12.5GB

No --volumes flag means volumes are kept. Safe default.

Aggressive: reclaim everything not in use

bash
docker system prune -af --volumes
# Includes:
# - all images not used by a container (not just dangling)
# - all volumes not mounted in any container

Dangerous in prod: a volume referenced by a stopped container is kept, but a volume attached to no container at all at that moment (for example, because the only container using it is being recreated) gets deleted. Use this in dev/CI, not prod.
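Before running the aggressive form, it is worth previewing which volumes are currently detached: all volumes minus the volumes attached to at least one container, running or stopped. The sketch uses sample volume names so it runs anywhere; the comments show the real commands to substitute on a live host.

```shell
# Hypothetical volume names; on a real host:
#   all_volumes: docker volume ls -q
#   in_use:      docker ps -aq | xargs docker inspect \
#                  -f '{{range .Mounts}}{{.Name}}{{"\n"}}{{end}}'
all_volumes='app-db-data
cache-vol
f3a9c1_anonymous'
in_use='app-db-data'

# comm -23 prints lines unique to the first (sorted) input: volumes that
# no container references, i.e. exactly what --volumes would delete.
tmp=$(mktemp)
printf '%s\n' "$in_use" | sort > "$tmp"
detached=$(printf '%s\n' "$all_volumes" | sort | comm -23 - "$tmp")
rm -f "$tmp"

echo "Would be deleted by --volumes:"
echo "$detached"
```

If anything in that list looks like it holds data you need, back it up or re-attach it before pruning.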

Targeted commands

bash
# Images
docker image prune -f                      # dangling only
docker image prune -af                     # all images not used by a container

# Containers
docker container prune -f                  # stopped containers

# Volumes
docker volume prune -f                     # volumes not mounted in any container

# Build cache
docker builder prune -f                    # dangling build cache
docker builder prune -af                   # all build cache
docker builder prune --filter until=168h   # cache older than 7 days

# Networks
docker network prune -f

Container logs

Logs default to json-file driver with no size limit. A chatty app fills GBs.

bash
# See per-container log file sizes.
# Note: `docker ps -q` prints short IDs, which do not match the full-ID
# directory names on disk, so ask the daemon for the real log path.
for c in $(docker ps -q); do
  name=$(docker inspect -f '{{.Name}}' "$c" | sed 's|^/||')
  log=$(docker inspect -f '{{.LogPath}}' "$c")
  size=$(sudo du -sh "$log" 2>/dev/null | cut -f1)
  echo "$size $name"
done

Truncate a runaway log without restart:

bash
sudo truncate -s 0 /var/lib/docker/containers/<id>/<id>-json.log

Permanently fix by setting log limits in /etc/docker/daemon.json:

json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}

Then sudo systemctl restart docker. Existing containers keep old config until recreated; new containers inherit.
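Before that restart, it is worth sanity-checking the file: a malformed daemon.json prevents dockerd from starting at all. A dependency-free sketch (jq, or `dockerd --validate` on newer versions, is better when available); it writes a temp copy so the example is self-contained — point cfg at /etc/docker/daemon.json for real use:

```shell
# Write the config to a temp file (stand-in for /etc/docker/daemon.json).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": { "max-size": "100m", "max-file": "3" }
}
EOF

# Crude check: braces must balance and both rotation keys must be present.
opens=$(tr -cd '{' < "$cfg" | wc -c)
closes=$(tr -cd '}' < "$cfg" | wc -c)
if [ "$opens" -eq "$closes" ] && grep -q max-size "$cfg" && grep -q max-file "$cfg"; then
  verdict="daemon.json looks sane"
else
  verdict="daemon.json is suspicious; do NOT restart docker yet"
fi
echo "$verdict"
rm -f "$cfg"
```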

For production, use a log-shipper (Fluentd, Loki, syslog) so logs leave the host entirely.

Move /var/lib/docker to a bigger partition

If the OS partition is small (e.g., a DigitalOcean droplet with 25 GB):

bash
# Stop the daemon
sudo systemctl stop docker

# Mount a new disk at /mnt/docker, then copy the data over
sudo rsync -a /var/lib/docker/ /mnt/docker/
sudo mv /var/lib/docker /var/lib/docker.bak
sudo ln -s /mnt/docker /var/lib/docker
# (or update daemon.json with "data-root": "/mnt/docker")

sudo systemctl start docker
docker info | grep 'Docker Root Dir'

Verify, then rm -rf /var/lib/docker.bak.
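The daemon.json route mentioned in the comment avoids the symlink and survives package upgrades; a minimal sketch of /etc/docker/daemon.json:

```json
{
  "data-root": "/mnt/docker"
}
```

Merge this key into any existing daemon.json (e.g. one that already sets log-opts) rather than replacing the file, then restart the daemon.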

Periodic cleanup via cron

bash
#!/bin/sh
# /etc/cron.daily/docker-prune
docker container prune -f
docker image prune -f
docker builder prune -f --filter until=72h
# Don't include --volumes; volume cleanup needs manual review

Make executable: chmod +x /etc/cron.daily/docker-prune.
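On systemd hosts, a timer is an equivalent alternative to cron.daily; a sketch with illustrative unit names (two separate files):

```ini
# /etc/systemd/system/docker-prune.service
[Unit]
Description=Prune unused Docker data

[Service]
Type=oneshot
ExecStart=/usr/bin/docker system prune -f

# /etc/systemd/system/docker-prune.timer
[Unit]
Description=Daily Docker prune

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

Enable with systemctl enable --now docker-prune.timer; Persistent=true runs a missed prune at next boot.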

"No space left" during a build

During a docker build, the error often comes from BuildKit's intermediate layers, not the final image:

bash
docker builder prune -af
# Frees the cache. Try the build again.

Or from /tmp (used for temporary downloads):

bash
df -h /tmp
# If /tmp is small, set TMPDIR=/var/tmp before docker build.

When prune does not help

Sometimes docker system df reports lots of reclaimable space but docker system prune reclaims very little. Causes:

  1. Inodes exhausted (not blocks). df -i to check.
  2. Open file handles holding deleted files. Restart the daemon to release them.
  3. Volume contents are huge but the volume itself is in use. The volume is "active" (used by a running container) so prune skips it. Inspect volume contents via docker run --rm -v <vol>:/data alpine du -sh /data.
  4. Snapshots/COW chains. With devicemapper, you might need to recreate the storage pool. Migrate to overlay2.
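Cause 1 is easy to automate: parse the IUse% column of `df -i`. The sketch below runs against captured sample output so it is self-contained; on a real host, replace the sample with `df -i /var/lib/docker`.

```shell
# Sample stands in for `df -i /var/lib/docker` on an inode-exhausted host.
sample='Filesystem      Inodes    IUsed  IFree IUse% Mounted on
/dev/sda1      6553600  6553590     10  100% /'

# Row 2, column 5 is IUse%; strip the % sign to compare numerically.
iuse=$(printf '%s\n' "$sample" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
if [ "$iuse" -ge 95 ]; then
  echo "inodes nearly exhausted (${iuse}%): blocks may be free but file creation will fail"
fi
```

Many small files (overlay2 layers are exactly that) can exhaust inodes while `df -h` still shows free space, which is why prune appears to "do nothing".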

Real-world usage

  • CI hosts: prune builder cache hourly; prune images daily; logs go to syslog.
  • Production app servers: log rotation in daemon.json; weekly prune cron; alerting on disk usage > 70%.
  • Single-host hobby: monthly docker system prune -af. Done.
  • Disk-emergency: docker system prune -af --volumes if you can confirm no detached but-needed volumes; otherwise prune images and builder first.
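The "alert on disk usage > 70%" rule above can be a one-liner check, suitable for cron or a monitoring agent's custom-script slot. The path is an assumption: the sketch checks the current directory's filesystem so it runs anywhere; point df at /var/lib/docker in real use.

```shell
# Minimal disk-usage alert matching the >70% rule of thumb.
threshold=70
# df -P guarantees one row per filesystem; row 2, column 5 is Capacity%.
pcent=$(df -P . | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
if [ "$pcent" -gt "$threshold" ]; then
  echo "ALERT: disk at ${pcent}% - time to prune"
else
  echo "OK: disk at ${pcent}%"
fi
```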

Common mistakes

Running prune --volumes in prod blindly

If a service is being recreated and its volume is briefly detached, prune deletes it. Always confirm volume usage:

bash
docker volume ls
# Manually inspect any unfamiliar volume before pruning

Forgetting -a on docker image prune

Without -a, only dangling images (no tag) are removed. Tagged but unused images stay.

Ignoring container logs

A single chatty service can fill 50 GB in <id>-json.log while you wonder why disk is full and docker system df shows nothing unusual.

Putting /var/lib/docker on the OS partition with no monitoring

When disk fills, the daemon may go unstable; restart fails because logs cannot flush. Separate partition + alerting prevents this.

Follow-up questions

Q: What is the difference between docker prune and docker rm?


A: docker rm <name> removes a specific container; docker container prune removes all stopped containers in one shot. Same idea for images and volumes.

Q: Will pruning kill running services?


A: No. Prune commands skip resources that are active (running containers, mounted volumes, used images). They only touch genuinely unused things.

Q: How do I see what is in a volume before deleting it?


A: docker run --rm -v <volname>:/data alpine ls -la /data. If important data, back it up before pruning.

Q: (Senior) How do you build a long-term retention policy for build cache?


A: BuildKit supports cache backends (--cache-to=type=registry,ref=...). Push cache to a dedicated registry image; locally prune anything older than N days; rely on remote cache for shared CI. This bounds local disk while preserving cross-build deduplication.

Q: (Senior) Why does my disk fill faster than docker system df says?


A: docker system df does not count everything under /var/lib/docker: container log files, BuildKit's own metadata, and external mounts are invisible to it. Compare with du -sh /var/lib/docker/*. Discrepancies usually mean orphaned overlay2 dirs from a daemon crash, very large container log files, or a host bind mount to a directory you forgot about.
