How to limit container CPU and memory resources?
Without resource limits, a single misbehaving container can consume all the host's CPU or memory and take down everything else on the box. Docker exposes Linux cgroup limits through simple flags. Setting them is basic production hygiene.
Theory
TL;DR
- Docker uses Linux cgroups to enforce limits.
- Memory: --memory=512m (hard cap; container OOM-killed if exceeded).
- CPU: --cpus=1.5 (throttle; container can use up to 1.5 cores' worth of CPU time).
- Reservations: --memory-reservation=256m (soft minimum).
- Compose has deploy.resources.limits and deploy.resources.reservations.
- Default = unlimited. A container with no limits can use the entire host.
Memory limits
docker run -d --memory=512m --name api myapp
# Hard cap. If api allocates more than 512MB, kernel OOM-kills it.
docker run -d --memory=512m --memory-swap=1g myapp
# 512MB RAM + up to 512MB swap = 1GB total
docker run -d --memory=512m --memory-swap=512m myapp
# 512MB RAM, NO swap (set memory-swap = memory)
docker run -d --memory=512m --memory-swap=-1 myapp
# 512MB RAM, unlimited swap (DANGEROUS: can exhaust the host's swap space)
Memory units: b (bytes), k (KiB), m (MiB), g (GiB). Default is bytes.
OOM behavior: when a container hits its memory limit, the Linux OOM killer terminates the process inside. Exit code 137. docker inspect shows OOMKilled: true.
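A quick way to confirm an OOM kill after the fact (assuming a container named api):
docker inspect api --format '{{.State.OOMKilled}} {{.State.ExitCode}}'
# Prints "true 137" if the kernel OOM-killed the container's main process.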
CPU limits
docker run -d --cpus=0.5 myapp
# 0.5 cores worth of CPU time. The container can spike to 100% of one core,
# but averaged over time stays at 50% of one core.
docker run -d --cpus=2 myapp
# 2 full cores worth.
docker run -d --cpu-shares=1024 myapp
# Relative weight (default 1024). Two containers with shares 1024 and 512 split CPU 2:1.
# Only matters under contention.
docker run -d --cpuset-cpus=0,1 myapp
# Pin to specific CPU cores (NUMA-aware).
CPU behavior: unlike memory, hitting the CPU limit does not kill the container — it just slows down. The process gets fewer CPU slices.
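To see the throttling behavior for yourself, a rough sketch (the alpine image and busy loop here are just stand-ins for any CPU-bound workload):
docker run -d --name burn --cpus=0.5 alpine sh -c 'while :; do :; done'
docker stats --no-stream burn
# CPU % hovers around 50; the loop is throttled, not killed.
docker rm -f burn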
Compose syntax
services:
  api:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 512M
        reservations:
          cpus: "0.25"
          memory: 256M
In modern Compose (the docker compose plugin / Compose Specification), deploy.resources is honored for both standalone Compose and Swarm; legacy docker-compose v1 only applied it with the --compatibility flag. The older v2-file syntax (mem_limit: and cpus: at the service level) still works for backward compatibility.
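For reference, the legacy v2-file equivalent might look like this (a sketch, assuming Compose file format 2.2+ for the cpus key):
services:
  api:
    image: myapp
    mem_limit: 512m
    mem_reservation: 256m
    cpus: 0.5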
Reservations vs limits
- Limit: the maximum the container can use. Hard cap for memory; throttle for CPU.
- Reservation: the minimum guaranteed. The scheduler (Swarm, K8s) places the container on a node that can satisfy the reservation; under contention, reserved resources go to the container before others.
For single-host Compose, reservations matter less (only one host); for Swarm, they drive placement decisions.
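On Swarm, the same idea can be expressed directly with docker service create flags (a sketch; myapp is a placeholder image):
docker service create --name api \
  --limit-memory 512m --reserve-memory 256m \
  --limit-cpu 1 --reserve-cpu 0.5 \
  myapp
# The scheduler only places the task on a node with 256MB and 0.5 CPU still unreserved.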
Inspecting limits and usage
# What limits a container has
docker inspect api --format '{{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}'
# Memory in bytes, CPUs in nanoCPUs (1 CPU = 1e9)
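# Illustrative output for a container started with --memory=512m --cpus=1.5:
#   536870912 1500000000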
# Live usage
docker stats --no-stream
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM %
a3f9d2b8c1e4   api   45.2%   312MiB / 512MiB   60.94%
The MEM USAGE / LIMIT column makes contention obvious. Approaching 100% means OOM is likely.
Common mistakes
No limits in production
docker run -d nginx # no limits
docker run -d misbehaving-app # no limits
# misbehaving-app eats all RAM; OS OOM-kills random processes;
# nginx might be a victim. The whole host gets unstable.
Fix: set sensible limits on every long-running container. Even a generous --memory=4g is much better than unlimited.
Memory limit too low for the runtime
docker run --memory=64m java-app
# The JVM likely starts and is then OOM-killed while allocating its heap.
Language runtimes have minimum overhead. JVM, Node, and Python all need 50-100MB just to start. Pick a limit that fits the workload plus a margin.
Confusing cpu-shares with cpus
--cpus=0.5 = absolute throttle (50% of one core, always).
--cpu-shares=512 = relative weight (only matters under contention; with no contention, the container can use all available CPU).
For predictable performance, prefer --cpus. cpu-shares is for prioritizing among containers when the host is fully busy.
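A sketch of the difference (worker is a placeholder for any CPU-bound image):
docker run -d --name high --cpu-shares=1024 worker
docker run -d --name low  --cpu-shares=512  worker
# On a fully busy host, "high" gets roughly twice the CPU time of "low";
# on an idle host, either container can use all available CPU.
docker run -d --name capped --cpus=0.5 worker
# "capped" never averages more than half a core, busy host or not.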
Forgetting the JVM does not see container limits without help
On old JVMs (pre-Java 10), the JVM looked at host memory, not the cgroup limit, and sized its heap off the host's RAM. Result: container OOM-killed. Modern JVMs (Java 10+, and 8u191+) are container-aware by default. For older JVMs, set -Xmx explicitly (or -XX:MaxRAMPercentage=75.0 where supported). The same caveat applies to some Node and Python tools.
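A sketch of pairing the container limit with a JVM flag, assuming a JVM new enough to support -XX:MaxRAMPercentage (my-java-app and app.jar are placeholders):
docker run -d --memory=512m my-java-app \
  java -XX:MaxRAMPercentage=75.0 -jar /app/app.jar
# Heap capped at ~384MB, leaving headroom for metaspace, threads, and off-heap buffers.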
Real-world usage
- Production: every long-running container has memory + CPU limits. Sized based on observed P99 usage + buffer.
- Multi-tenant nodes: strict limits prevent one tenant from starving others.
- CI runners: --cpus=2 --memory=4g per build container so parallel jobs do not slow each other.
- Local dev: sometimes worth setting modest limits to catch regressions early (a memory leak that only shows up at 8GB never shows up if your laptop has 32GB).
Follow-up questions
Q: What is the difference between --memory and --memory-swap?
A: --memory is RAM only. --memory-swap is total RAM + swap. Set --memory-swap = --memory to disable swap entirely (recommended for predictable performance).
Q: Why did my container exit 137 even though I think it had memory left?
A: Exit code 137 means the process received SIGKILL (128 + 9). Possibilities: (1) the OOM killer chose your container's main process due to its OOM score. (2) kill -9 from outside. (3) The daemon's grace period elapsed during docker stop and it sent SIGKILL. Check docker inspect <name> for OOMKilled: true to confirm OOM specifically.
Q: Can I update limits without restarting?
A: Yes, with docker update: docker update --memory=1g --cpus=1 api. The change applies immediately to the running container — no restart needed.
Q: How do limits work with --privileged?
A: Limits still apply. --privileged lifts capability restrictions (lets the container do raw block I/O, etc.) but does NOT remove cgroup limits.
Q: (Senior) How do you size memory limits in practice?
A: Run the workload realistically (load test, prod traffic) without limits. Watch docker stats peak memory. Set the limit at peak + 30-50% headroom. Repeat after every meaningful code change. For JVM or Python services, account for the runtime's overhead and any caches. Rule of thumb: too tight kills the app on traffic spikes; too loose lets memory leaks go unnoticed; the right number is just slightly above the worst case observed.
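A rough way to capture peak usage during a load test (a sketch; the container name and sampling interval are arbitrary):
while true; do
  docker stats --no-stream --format '{{.Name}} {{.MemUsage}}' api >> mem-samples.log
  sleep 5
done
# Afterwards, take the peak from mem-samples.log and add 30-50% headroom.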
Examples
Sized service with monitoring
$ docker run -d \
--name api \
--memory=512m \
--memory-reservation=256m \
--cpus=1 \
--restart=unless-stopped \
myapp:1.0
$ docker stats --no-stream api
CONTAINER CPU % MEM USAGE / LIMIT MEM %
api         12.3%   220MiB / 512MiB   43.0%
Usage tracking is built-in. Alert (via Prometheus or similar) when MEM% > 80% sustained.
Compose with reservations
services:
  api:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: "1"
          memory: 512M
        reservations:
          cpus: "0.5"
          memory: 256M
  db:
    image: postgres:16
    deploy:
      resources:
        limits:
          memory: 1G
DB gets a higher memory ceiling because Postgres caches the working set. API is CPU-tighter.
Update a running container
$ docker stats --no-stream api
MEM USAGE / LIMIT MEM %
450MiB / 512MiB 87.9%
# Approaching limit
$ docker update --memory=1g api
api
$ docker stats --no-stream api
MEM USAGE / LIMIT MEM %
450MiB / 1GiB    43.9%
No restart needed. Useful for emergency response without redeploying.