- docker inspect <cid> --format='{{.State.OOMKilled}}' confirms an OOM kill: if it prints true, the container was killed for exceeding memory. Run this before assuming a memory issue. [src3, src4]
- docker stats shows live memory usage.
- docker run --memory=512m sets the memory limit; dmesg | grep -i oom shows kernel OOM events.
- --memory without --memory-swap defaults swap to 2× the memory limit.
- Java's -Xmx must be lower than --memory to leave room for non-heap memory: set it to 70-80% of the container limit. Setting it equal to --memory guarantees OOMKilled because metaspace, threads, and GC buffers need the remaining 20-30%. [src1, src4]
- Always set memory limits in production with --memory flags. [src1]
- On cgroups v2 hosts, --oom-kill-disable is silently discarded in Docker Engine 27+; the container will still be OOM-killed. [src7]
- The --kernel-memory flag is likewise discarded on cgroups v2; do not rely on it for kernel memory limits on modern Linux hosts. [src7]

| # | Cause | Likelihood | Signature | Fix |
|---|---|---|---|---|
| 1 | Container memory limit too low | ~30% | OOMKilled: true; app uses expected memory | Increase --memory limit [src1, src4] |
| 2 | Application memory leak | ~25% | Memory grows linearly; OOM after hours/days | Profile and fix the leak [src4, src8] |
| 3 | JVM heap exceeds container limit | ~15% | Java app; -Xmx ≥ --memory | Set -Xmx to 70-80% of container memory [src1, src4] |
| 4 | Large file/data processing | ~8% | OOM during specific operations | Use streaming/chunked processing [src8] |
| 5 | Fork bomb / child process explosion | ~5% | Memory spikes instantly; many processes | Set --pids-limit; fix fork logic [src1] |
| 6 | Host itself is out of memory | ~5% | Multiple containers OOMKilled; dmesg shows OOM | Add host RAM; reduce container count [src4, src6] |
| 7 | No memory limit + host exhaustion | ~4% | No --memory flag; host runs out | Always set memory limits in production [src1] |
| 8 | Memory-mapped files / tmpfs | ~3% | Container uses tmpfs or mmap | Exclude tmpfs or increase limit [src1] |
| 9 | Swap disabled or misconfigured | ~3% | OOMs at exactly the --memory value | Configure --memory-swap [src1] |
| 10 | Build-time OOM | ~2% | OOM during npm install, compilation | Increase Docker Desktop memory [src1] |
START — Container exits with code 137
├── Is OOMKilled true? (docker inspect --format='{{.State.OOMKilled}}')
│ ├── YES → Out of memory ↓
│ └── NO → Not OOM — killed by docker kill, docker stop timeout, or orchestrator
├── Was a memory limit set? (docker inspect --format='{{.HostConfig.Memory}}')
│ ├── YES → Container exceeded this limit
│ │ ├── Is the limit too low? → Increase --memory [src1]
│ │ ├── Gradual growth (leak)? → Profile the application [src4, src8]
│ │ └── Java/JVM? → Check -Xmx (leave 25-30% for non-heap) [src1]
│ └── NO → Host ran out of memory
│ ├── Check host: free -m, dmesg | grep oom [src6]
│ └── Set limits on all containers [src1]
├── cgroups version? (stat -fc %T /sys/fs/cgroup)
│ ├── cgroup2fs → v2: check memory.max, memory.events [src6]
│ └── tmpfs → v1: check memory.limit_in_bytes, memory.oom_control [src6]
├── Build-time OOM? → Increase Docker Desktop memory [src1]
└── Monitor with docker stats to find peak usage [src2]
    ├── Peak near limit? → Increase limit by 20-30%
    └── Sudden spike? → Profile that code path
Not every exit code 137 is OOM — it can also be a manual docker kill. [src3, src4]
docker inspect <cid> --format='{{.State.OOMKilled}}'
docker inspect <cid> --format='{{json .State}}' | python -m json.tool
dmesg | grep -i "oom\|out of memory" | tail -20
# cgroups v2: check OOM event count
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.events
Verify: OOMKilled: true confirms the container was killed for exceeding memory.
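If OOMKilled is false despite exit code 137, the SIGKILL likely came from outside the container. One way to check, sketched here with a one-hour lookback (the window and the container filter are assumptions; the command streams until you press Ctrl-C), is to list recent kill events from the Docker event log:
# Kill events in the last hour; the signal attribute shows what was sent
docker events --since 1h --filter container=<cid> --filter event=kill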
Understand what limits are configured. [src1, src3]
docker inspect <cid> --format='Memory: {{.HostConfig.Memory}}, Swap: {{.HostConfig.MemorySwap}}'
docker inspect <cid> --format='{{.HostConfig.Memory}}' | awk '{print $1/1024/1024 "MB"}'
# Check cgroups version
stat -fc %T /sys/fs/cgroup # cgroup2fs = v2, tmpfs = v1
Verify: Know the exact limit (0 = unlimited).
Use docker stats to see real-time memory consumption. [src2]
docker stats
docker stats --no-stream
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}\t{{.PIDs}}"
Verify: Watch memory over time — gradual growth = leak; spikes = load-dependent.
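To turn that observation into data you can graph, a minimal sketch (container name myapp and the 60-second interval are assumptions) appends one timestamped sample per minute to a CSV; a steadily rising usage column points to a leak:
# Append "timestamp,usage,percent" samples; stop with Ctrl-C
while true; do
  echo "$(date -u +%FT%TZ),$(docker stats --no-stream --format '{{.MemUsage}},{{.MemPerc}}' myapp)" >> memory-samples.csv
  sleep 60
done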
Configure limits based on observed usage plus headroom. [src1]
docker run --memory=512m --memory-swap=1g myapp
docker run --memory=512m --memory-swap=512m myapp # No swap
docker run --memory=1g --memory-reservation=512m myapp # Soft limit
# Docker Compose
services:
  app:
    deploy:
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 256M
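One rough way to pick the initial number, assuming the container is named myapp and docker stats reports its usage in MiB, is to take the observed figure and add about 30% headroom; sample under realistic peak load before trusting the result:
# Current usage in MiB (left side of "123MiB / 512MiB"), plus ~30% headroom
usage=$(docker stats --no-stream --format '{{.MemUsage}}' myapp | cut -d'/' -f1 | tr -d ' MiB')
awk -v u="$usage" 'BEGIN { printf "suggested --memory: %.0fm\n", u * 1.3 }'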
Runtime-specific settings must align with container limits. [src1, src4]
# Java 10+ (container-aware)
docker run --memory=1g myapp java -XX:MaxRAMPercentage=75.0 -jar app.jar
# Node.js
docker run --memory=512m myapp node --max-old-space-size=384 app.js
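To confirm the JVM derives its heap from the container limit rather than host RAM, one check (it assumes java is on the container's PATH) is to print the resolved MaxHeapSize inside the container with the same percentage flag the app uses:
# MaxHeapSize should be roughly 75% of the --memory value with MaxRAMPercentage=75.0
docker exec <cid> java -XX:+PrintFlagsFinal -XX:MaxRAMPercentage=75.0 -version | grep -i maxheapsize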
Get detailed memory breakdown. [src4, src8]
docker exec <cid> ps aux --sort=-%mem | head -20
docker exec <cid> cat /proc/meminfo | head -10
docker exec <cid> cat /proc/1/status | grep -E "VmRSS|VmSize|VmPeak"
# cgroups v2: check memory pressure
docker exec <cid> cat /sys/fs/cgroup/memory.pressure 2>/dev/null
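Because the figures above include page cache, it can help to split reclaimable cache from anonymous memory. On a cgroups v2 host with the default systemd cgroup driver (the scope path is an assumption; adjust for your setup), memory.stat provides the breakdown:
# anon = process heap/stack (not reclaimable), file = page cache (mostly reclaimable)
grep -E '^(anon|file) ' /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.stat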
Proactive monitoring before OOM kills. [src5]
#!/bin/bash
THRESHOLD=80
while true; do
  docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | while read -r name pct; do
    pct_num=$(echo "$pct" | tr -d '%')
    if (( $(echo "$pct_num > $THRESHOLD" | bc -l) )); then
      echo "WARNING $(date): $name at ${pct_num}% memory"
    fi
  done
  sleep 30
done
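To keep the watcher running unattended, one option (the memory-watch.sh file name and log path are illustrative) is to launch it in the background via nohup:
# Run the threshold watcher in the background and capture its warnings
nohup ./memory-watch.sh >> /var/log/memory-watch.log 2>&1 &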
Full script: container-memory-diagnostics-script.sh (43 lines)
#!/bin/bash
# Input: Container ID or name
# Output: Complete memory diagnostic report
CID="$1"
if [ -z "$CID" ]; then echo "Usage: $0 <container_id>"; exit 1; fi
echo "=== Memory Diagnostic Report ==="
echo "Container: $CID"
echo "Time: $(date -u)"
STATE=$(docker inspect "$CID" --format='{{.State.Status}}' 2>/dev/null)
if [ -z "$STATE" ]; then echo "Container not found"; exit 1; fi
OOM=$(docker inspect "$CID" --format='{{.State.OOMKilled}}')
EXIT=$(docker inspect "$CID" --format='{{.State.ExitCode}}')
echo "Status: $STATE | Exit: $EXIT | OOMKilled: $OOM"
MEM=$(docker inspect "$CID" --format='{{.HostConfig.Memory}}')
echo "Memory limit: $(echo "$MEM" | awk '{if($1>0) printf "%.0fMB\n",$1/1024/1024; else print "unlimited"}')"
if [ "$STATE" = "running" ]; then
echo "=== Current Usage ==="
docker stats --no-stream --format "Memory: {{.MemUsage}} ({{.MemPerc}})" "$CID"
echo "=== Top Processes ==="
docker exec "$CID" ps aux --sort=-%mem 2>/dev/null | head -6
fi
echo "=== Last 10 Log Lines ==="
docker logs --tail 10 "$CID" 2>&1
echo "=== Host OOM Events ==="
dmesg 2>/dev/null | grep -i "oom\|killed process" | tail -5 || echo "(unavailable)"
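A typical invocation, assuming the script is saved under the name shown above and the container is called api:
chmod +x container-memory-diagnostics-script.sh
./container-memory-diagnostics-script.sh api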
Full script: docker-compose-with-proper-memory-management.yml (62 lines)
version: "3.8"
services:
api:
build: ./api
deploy:
resources:
limits:
memory: 512M
cpus: "1.0"
reservations:
memory: 256M
environment:
JAVA_OPTS: "-XX:MaxRAMPercentage=75.0 -XX:+UseG1GC"
NODE_OPTIONS: "--max-old-space-size=384"
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
interval: 30s
retries: 3
worker:
build: ./worker
deploy:
resources:
limits:
memory: 1G
restart: on-failure:5
postgres:
image: postgres:16
deploy:
resources:
limits:
memory: 1G
environment:
POSTGRES_SHARED_BUFFERS: "256MB"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 10s
redis:
image: redis:7-alpine
deploy:
resources:
limits:
memory: 256M
command: ["redis-server", "--maxmemory", "200mb", "--maxmemory-policy", "allkeys-lru"]
Full script: memory-leak-detection-in-running-container.py (50 lines)
#!/usr/bin/env python3
# Input: Container name to monitor
# Output: Alert when memory growth exceeds threshold
import subprocess, json, time, sys
from datetime import datetime

def get_container_memory(cid):
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{json .}}", cid],
        capture_output=True, text=True
    )
    if result.returncode != 0:
        return None
    data = json.loads(result.stdout.strip())
    usage_str = data.get("MemUsage", "").split("/")[0].strip()
    if "GiB" in usage_str:
        return float(usage_str.replace("GiB", "")) * 1024 * 1024 * 1024
    elif "MiB" in usage_str:
        return float(usage_str.replace("MiB", "")) * 1024 * 1024
    return 0

def monitor(cid, interval=30, growth_threshold_mb=50, samples=10):
    history = []
    print(f"Monitoring {cid} every {interval}s...")
    while True:
        mem = get_container_memory(cid)
        if mem is None:
            print("WARNING: Container not found")
            break
        history.append((datetime.now(), mem))
        mem_mb = mem / 1024 / 1024
        print(f"[{datetime.now():%H:%M:%S}] {mem_mb:.1f} MB")
        if len(history) >= samples:
            growth_mb = (mem - history[-samples][1]) / 1024 / 1024
            if growth_mb > growth_threshold_mb:
                rate = growth_mb / (samples * interval / 60)
                print(f"ALERT: +{growth_mb:.1f}MB ({rate:.1f} MB/min)")
        time.sleep(interval)

if __name__ == "__main__":
    monitor(sys.argv[1] if len(sys.argv) > 1 else "myapp")
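Example invocation, assuming the script name above and a container called api (the defaults sample every 30 seconds and alert on more than 50 MB of growth across 10 samples):
python3 memory-leak-detection-in-running-container.py api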
# BAD — container can consume all host memory [src1]
docker run -d myapp
# GOOD — constrained to 512MB [src1]
docker run -d --memory=512m --memory-swap=1g myapp
# BAD — no room for non-heap memory [src1, src4]
docker run --memory=1g myapp java -Xmx1g -jar app.jar
# GOOD — leaves 25-30% for metaspace, threads, buffers [src1, src4]
docker run --memory=1g myapp java -XX:MaxRAMPercentage=75.0 -jar app.jar
# BAD — infinite restart loop [src4]
services:
  app:
    restart: always
# GOOD — stops after 5 failures [src4]
services:
  app:
    restart: on-failure:5
    deploy:
      resources:
        limits:
          memory: 512M
# BAD — flag silently discarded on cgroups v2 (Docker 27+) [src7]
docker run --memory=512m --oom-kill-disable myapp
# GOOD — right-size limit and monitor proactively [src1, src5]
docker run --memory=512m --memory-reservation=384m myapp
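To spot the restart-loop case quickly, the engine's restart counter is a useful check; a climbing value alongside exit code 137 usually means the container is being OOM-killed and revived in a loop:
docker inspect <cid> --format='Restarts: {{.RestartCount}}, Exit: {{.State.ExitCode}}, OOMKilled: {{.State.OOMKilled}}'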
- Exit code 137 is not always OOM: the kill can come from docker kill, a docker stop timeout, or the orchestrator. Check the OOMKilled flag. [src3, src4]
- --memory without --memory-swap: swap defaults to 2× memory on Linux Engine and may be disabled on Docker Desktop. Set it explicitly. [src1]
- Older JVMs are not container-aware: enable UseContainerSupport or set -Xmx explicitly. [src1, src4]
- docker stats includes page cache: high usage may be reclaimable cache, not actual pressure. The kernel evicts cache before OOM. [src2]
- --tmpfs and /dev/shm count against the limit: large temp files can trigger OOM. [src1]
- docker build runs in a container too: npm install and compilation spike memory. Increase the Docker Desktop allocation. [src1]
- --oom-kill-disable and --kernel-memory are discarded on cgroups v2: Docker Engine 27+ silently ignores these flags on cgroups v2 hosts. [src7]
- memory.high throttling (cgroups v2): allocations above memory.high are throttled rather than killed. This causes slowness without kills; check memory.pressure for stall indicators. [src5, src6]

# === Confirm OOM ===
docker inspect <cid> --format='OOMKilled: {{.State.OOMKilled}}, Exit: {{.State.ExitCode}}'
dmesg | grep -i "oom\|killed process" | tail -10
# === Memory Usage ===
docker stats
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"
# === Memory Limits ===
docker inspect <cid> --format='Memory: {{.HostConfig.Memory}}, Swap: {{.HostConfig.MemorySwap}}'
# === Inside Container ===
docker exec <cid> cat /proc/meminfo | head -5
docker exec <cid> cat /proc/1/status | grep VmRSS
docker exec <cid> ps aux --sort=-%mem | head -10
# === cgroups v2 ===
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.current
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.max
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.events
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.pressure
# === cgroups v1 (Legacy) ===
cat /sys/fs/cgroup/memory/docker/<id>/memory.usage_in_bytes
cat /sys/fs/cgroup/memory/docker/<id>/memory.limit_in_bytes
# === Host Memory ===
free -m
vmstat 1 5
# === cgroups version check ===
stat -fc %T /sys/fs/cgroup # cgroup2fs = v2, tmpfs = v1
| Version | Behavior | Key Changes |
|---|---|---|
| Docker 27+ (2024) | Current | OOMScoreAdj for services; --oom-kill-disable and --kernel-memory discarded on cgroups v2; containerd default [src7] |
| Docker 25-26 | Stable | Improved memory metrics; containerd integration [src1] |
| Docker 24 | Stable | cgroups v2 fully supported; BuildKit default [src1, src6] |
| Docker 23 | Stable | Compose V2 default [src1] |
| Docker 20.10 | LTS-like | cgroups v2 support [src1, src6] |
| Docker 19.03 | Legacy | --memory-swap standardized [src1] |
| cgroups v2 | Modern Linux | memory.max; memory.events for OOM count; memory.pressure for stall tracking; memory.high soft throttling [src6] |
| cgroups v1 | Legacy Linux | memory.limit_in_bytes + memory.oom_control; --oom-kill-disable works [src6] |
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Exit 137 and OOMKilled=true | Exit 137 but OOMKilled=false | Check for docker kill or stop timeout |
| Memory grows over time in stats | Memory is constant but app is slow | CPU profiling |
| Java/Node process needs right-sizing | Build step fails with OOM | Increase Docker Desktop memory |
| Multiple containers compete for memory | Single container with headroom | Other issues (CPU, IO, network) |
| Container keeps restarting with exit 137 | Container runs but performance degrades | Check cgroups v2 memory.pressure for throttling |
- Docker Desktop runs the engine inside a VM: even with --memory limits, total usage across all containers cannot exceed the VM's allocation. Docker Desktop 4.38 introduced a known regression causing higher baseline memory consumption.
- The OOMKilled flag only indicates PID 1 was OOM-killed. Child process OOM kills may show OOMKilled: false. Check dmesg.
- docker stats memory includes page cache. High usage may be reclaimable; the kernel evicts cache before OOM.
- Kubernetes sets container limits via resources.limits.memory. On K8s 1.28+ with cgroups v2, the cgroup-aware OOM killer terminates ALL processes in the cgroup. A singleProcessOOMKill kubelet flag is being developed to restore the old behavior.
- With --memory-swap set above --memory, a container can exceed --memory without OOM (using swap). This causes slowness, not kills. Monitor swap separately.
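To monitor swap as suggested above, on a cgroups v2 host the per-container swap counters can be read directly (the systemd scope path is an assumption, matching the reference commands earlier):
# Bytes of swap currently used by the container, and the configured ceiling
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.swap.current
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.swap.max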