How to limit container CPU and memory resources?
Without resource limits, a single misbehaving container can consume all the host's CPU or memory and take down everything else on the box. Docker exposes Linux cgroup limits through simple flags. Setting them is basic production hygiene.
Theory
TL;DR
- Docker uses Linux cgroups to enforce limits.
- Memory: --memory=512m (hard cap; container OOM-killed if exceeded).
- CPU: --cpus=1.5 (throttle; container can use up to 1.5 cores' worth of CPU time).
- Reservations: --memory-reservation=256m (soft minimum).
- Compose has deploy.resources.limits and deploy.resources.reservations.
- Default = unlimited. A container with no limits can use the entire host.
Memory limits
docker run -d --memory=512m --name api myapp
# Hard cap. If api allocates more than 512MB, kernel OOM-kills it.
docker run -d --memory=512m --memory-swap=1g myapp
# 512MB RAM + up to 512MB swap = 1GB total
docker run -d --memory=512m --memory-swap=512m myapp
# 512MB RAM, NO swap (set memory-swap = memory)
docker run -d --memory=512m --memory-swap=-1 myapp
# 512MB RAM, unlimited swap (DANGEROUS: can exhaust the host's swap space)
Memory units: b (bytes), k (KiB), m (MiB), g (GiB). Default is bytes.
OOM behavior: when a container hits its memory limit, the Linux OOM killer terminates the process inside. Exit code 137. docker inspect shows OOMKilled: true.
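A quick way to confirm an OOM kill after the fact (assuming a container named api):
docker inspect api --format '{{.State.OOMKilled}} {{.State.ExitCode}}'
# Prints "true 137" if the kernel OOM-killed the container's main process.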
CPU limits
docker run -d --cpus=0.5 myapp
# 0.5 cores worth of CPU time. The container can spike to 100% of one core,
# but averaged over time stays at 50% of one core.
docker run -d --cpus=2 myapp
# 2 full cores worth.
docker run -d --cpu-shares=1024 myapp
# Relative weight (default 1024). Two containers with shares 1024 and 512 split CPU 2:1.
# Only matters under contention.
docker run -d --cpuset-cpus=0,1 myapp
# Pin to specific CPU cores (NUMA-aware).
CPU behavior: unlike memory, hitting the CPU limit does not kill the container — it just slows down. The process gets fewer CPU slices.
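To see the throttling behavior for yourself, a rough sketch (the alpine image and busy loop here are just stand-ins for any CPU-bound workload):
docker run -d --name burn --cpus=0.5 alpine sh -c 'while :; do :; done'
docker stats --no-stream burn
# CPU % hovers around 50; the loop is throttled, not killed.
docker rm -f burn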
Compose syntax
services:
  api:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 512M
        reservations:
          cpus: "0.25"
          memory: 256M
In modern Compose (the docker compose plugin / Compose Specification), deploy.resources is honored for both standalone Compose and Swarm; legacy docker-compose v1 only applied it with the --compatibility flag. The older v2-file syntax (mem_limit: and cpus: at the service level) still works for backward compatibility.
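For reference, the legacy v2-file equivalent might look like this (a sketch, assuming Compose file format 2.2+ for the cpus key):
services:
  api:
    image: myapp
    mem_limit: 512m
    mem_reservation: 256m
    cpus: 0.5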
Reservations vs limits
- Limit: the maximum the container can use. Hard cap for memory; throttle for CPU.
- Reservation: the minimum guaranteed. The scheduler (Swarm, K8s) places the container on a node that can satisfy the reservation; under contention, reserved resources go to the container before others.
For single-host Compose, reservations matter less (only one host); for Swarm, they drive placement decisions.
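On Swarm, the same idea can be expressed directly with docker service create flags (a sketch; myapp is a placeholder image):
docker service create --name api \
  --limit-memory 512m --reserve-memory 256m \
  --limit-cpu 1 --reserve-cpu 0.5 \
  myapp
# The scheduler only places the task on a node with 256MB and 0.5 CPU still unreserved.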
Inspecting limits and usage
# What limits a container has
docker inspect api --format '{{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}'
# Memory in bytes, CPUs in nanoCPUs (1 CPU = 1e9)
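# Illustrative output for a container started with --memory=512m --cpus=1.5:
#   536870912 1500000000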
# Live usage
docker stats --no-stream
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM %
a3f9d2b8c1e4   api   45.2%   312MiB / 512MiB   60.94%
The MEM USAGE / LIMIT column makes contention obvious. Approaching 100% means OOM is likely.
Common mistakes
No limits in production
docker run -d nginx # no limits
docker run -d misbehaving-app # no limits
# misbehaving-app eats all RAM; OS OOM-kills random processes;
# nginx might be a victim. The whole host gets unstable.
Fix: set sensible limits on every long-running container. Even a generous --memory=4g is much better than unlimited.
Memory limit too low for the runtime
docker run --memory=64m java-app
# The JVM likely starts and is then OOM-killed while allocating its heap.
Language runtimes have minimum overhead. JVM, Node, and Python all need 50-100MB just to start. Pick a limit that fits the workload plus a margin.
Confusing cpu-shares with cpus
--cpus=0.5 = absolute throttle (50% of one core, always).
--cpu-shares=512 = relative weight (only matters under contention; with no contention, the container can use all available CPU).
For predictable performance, prefer --cpus. cpu-shares is for prioritizing among containers when the host is fully busy.
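A sketch of the difference (worker is a placeholder for any CPU-bound image):
docker run -d --name high --cpu-shares=1024 worker
docker run -d --name low  --cpu-shares=512  worker
# On a fully busy host, "high" gets roughly twice the CPU time of "low";
# on an idle host, either container can use all available CPU.
docker run -d --name capped --cpus=0.5 worker
# "capped" never averages more than half a core, busy host or not.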
Forgetting the JVM does not see container limits without help
On old JVMs (pre-Java 10), the JVM looked at host memory, not the cgroup limit, and sized its heap off the host's RAM. Result: container OOM-killed. Modern JVMs (Java 10+, and 8u191+) are container-aware by default. For older JVMs, set -Xmx explicitly (or -XX:MaxRAMPercentage=75.0 where supported). The same caveat applies to some Node and Python tools.
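A sketch of pairing the container limit with a JVM flag, assuming a JVM new enough to support -XX:MaxRAMPercentage (my-java-app and app.jar are placeholders):
docker run -d --memory=512m my-java-app \
  java -XX:MaxRAMPercentage=75.0 -jar /app/app.jar
# Heap capped at ~384MB, leaving headroom for metaspace, threads, and off-heap buffers.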
Real-world usage
- Production: every long-running container has memory + CPU limits. Sized based on observed P99 usage + buffer.
- Multi-tenant nodes: strict limits prevent one tenant from starving others.
- CI runners: --cpus=2 --memory=4g per build container so parallel jobs do not slow each other.
- Local dev: sometimes worth setting modest limits to catch regressions early (a memory leak that only shows up at 8GB never shows up if your laptop has 32GB).
Follow-up questions
Q: What is the difference between --memory and --memory-swap?
A: --memory is RAM only. --memory-swap is total RAM + swap. Set --memory-swap = --memory to disable swap entirely (recommended for predictable performance).
Q: Why did my container exit 137 even though I think it had memory left?
A: Exit code 137 means the process received SIGKILL (128 + 9). Possibilities: (1) the OOM killer chose your container's main process due to its OOM score. (2) kill -9 from outside. (3) The daemon's grace period elapsed during docker stop and it sent SIGKILL. Check docker inspect <name> for OOMKilled: true to confirm OOM specifically.
Q: Can I update limits without restarting?
A: Yes, with docker update: docker update --memory=1g --cpus=1 api. The change applies immediately to the running container — no restart needed.
Q: How do limits work with --privileged?
A: Limits still apply. --privileged lifts capability restrictions (lets the container do raw block I/O, etc.) but does NOT remove cgroup limits.
Q: (Senior) How do you size memory limits in practice?
A: Run the workload realistically (load test, prod traffic) without limits. Watch docker stats peak memory. Set the limit at peak + 30-50% headroom. Repeat after every meaningful code change. For JVM or Python services, account for the runtime's overhead and any caches. Rule of thumb: too tight kills the app on traffic spikes; too loose lets memory leaks go unnoticed; the right number is just slightly above the worst case observed.
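A rough way to capture peak usage during a load test (a sketch; the container name and sampling interval are arbitrary):
while true; do
  docker stats --no-stream --format '{{.Name}} {{.MemUsage}}' api >> mem-samples.log
  sleep 5
done
# Afterwards, take the peak from mem-samples.log and add 30-50% headroom.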
Examples
Sized service with monitoring
$ docker run -d \
--name api \
--memory=512m \
--memory-reservation=256m \
--cpus=1 \
--restart=unless-stopped \
myapp:1.0
$ docker stats --no-stream api
CONTAINER CPU % MEM USAGE / LIMIT MEM %
api         12.3%   220MiB / 512MiB   43.0%
Usage tracking is built-in. Alert (via Prometheus or similar) when MEM% > 80% sustained.
Compose with reservations
services:
  api:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: "1"
          memory: 512M
        reservations:
          cpus: "0.5"
          memory: 256M
  db:
    image: postgres:16
    deploy:
      resources:
        limits:
          memory: 1G
DB gets a higher memory ceiling because Postgres caches the working set. API is CPU-tighter.
Update a running container
$ docker stats --no-stream api
MEM USAGE / LIMIT MEM %
450MiB / 512MiB 87.9%
# Approaching limit
$ docker update --memory=1g api
api
$ docker stats --no-stream api
MEM USAGE / LIMIT MEM %
450MiB / 1GiB    43.9%
No restart needed. Useful for emergency response without redeploying.