How to Diagnose and Fix Docker OOMKilled Errors

Type: Software Reference Confidence: 0.93 Sources: 8 Verified: 2026-02-23 Freshness: quarterly

TL;DR

Constraints

Quick Reference

# | Cause | Likelihood | Signature | Fix
1 | Container memory limit too low | ~30% | OOMKilled: true; app uses expected memory | Increase --memory limit [src1, src4]
2 | Application memory leak | ~25% | Memory grows linearly; OOM after hours/days | Profile and fix the leak [src4, src8]
3 | JVM heap exceeds container limit | ~15% | Java app; -Xmx at or above --memory | Set -Xmx to 70-80% of container memory [src1, src4]
4 | Large file/data processing | ~8% | OOM during specific operations | Use streaming/chunked processing [src8]
5 | Fork bomb / child process explosion | ~5% | Memory spikes instantly; many processes | Set --pids-limit; fix fork logic [src1]
6 | Host itself is out of memory | ~5% | Multiple containers OOMKilled; dmesg shows OOM | Add host RAM; reduce container count [src4, src6]
7 | No memory limit + host exhaustion | ~4% | No --memory flag; host runs out | Always set memory limits in production [src1]
8 | Memory-mapped files / tmpfs | ~3% | Container uses tmpfs or mmap | Exclude tmpfs or increase limit [src1]
9 | Swap disabled or misconfigured | ~3% | OOMs at exactly the --memory value | Configure --memory-swap [src1]
10 | Build-time OOM | ~2% | OOM during npm install, compilation | Increase Docker Desktop memory [src1]

Decision Tree

START — Container exits with code 137
├── Is OOMKilled true? (docker inspect --format='{{.State.OOMKilled}}')
│   ├── YES → Out of memory ↓
│   └── NO → Not OOM — killed by docker kill, docker stop timeout, or orchestrator
├── Was a memory limit set? (docker inspect --format='{{.HostConfig.Memory}}')
│   ├── YES → Container exceeded this limit
│   │   ├── Is the limit too low? → Increase --memory [src1]
│   │   ├── Gradual growth (leak)? → Profile the application [src4, src8]
│   │   └── Java/JVM? → Check -Xmx (leave 25-30% for non-heap) [src1]
│   └── NO → Host ran out of memory
│       ├── Check host: free -m, dmesg | grep oom [src6]
│       └── Set limits on all containers [src1]
├── cgroups version? (stat -fc %T /sys/fs/cgroup)
│   ├── cgroup2fs → v2: check memory.max, memory.events [src6]
│   └── tmpfs → v1: check memory.limit_in_bytes, memory.oom_control [src6]
├── Build-time OOM? → Increase Docker Desktop memory [src1]
└── Monitor with docker stats to find peak usage [src2]
    ├── Peak near limit? → Increase limit by 20-30%
    └── Sudden spike? → Profile that code path
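
The first two branches of the tree can be made mechanical. A minimal sketch (the helper name `triage_exit_137` is hypothetical) that classifies an exit-137 container from its `docker inspect` fields:

```python
def triage_exit_137(oom_killed: bool, memory_limit_bytes: int) -> str:
    """Classify an exit-137 container.

    oom_killed         <- .State.OOMKilled
    memory_limit_bytes <- .HostConfig.Memory (0 means no limit was set)
    """
    if not oom_killed:
        return "not OOM: check docker kill, stop timeout, or the orchestrator"
    if memory_limit_bytes > 0:
        mb = memory_limit_bytes // (1024 * 1024)
        return f"container exceeded its {mb} MB limit"
    return "no limit set: the host ran out of memory (check free -m, dmesg)"
```

For example, `triage_exit_137(True, 536870912)` reports that a 512 MB limit was exceeded, matching the YES/YES path in the tree.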

Step-by-Step Guide

1. Confirm the OOM kill

Not every exit code 137 is OOM — it can also be a manual docker kill. [src3, src4]

docker inspect <cid> --format='{{.State.OOMKilled}}'
docker inspect <cid> --format='{{json .State}}' | python -m json.tool
dmesg | grep -i "oom\|out of memory" | tail -20
# cgroups v2: check OOM event count
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.events

Verify: OOMKilled: true confirms the container was killed for exceeding memory.

2. Check current memory limits

Understand what limits are configured. [src1, src3]

docker inspect <cid> --format='Memory: {{.HostConfig.Memory}}, Swap: {{.HostConfig.MemorySwap}}'
docker inspect <cid> --format='{{.HostConfig.Memory}}' | awk '{print $1/1024/1024 "MB"}'
# Check cgroups version
stat -fc %T /sys/fs/cgroup   # cgroup2fs = v2, tmpfs = v1

Verify: Know the exact limit (0 = unlimited).
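
Since .HostConfig.Memory is reported in raw bytes, a small helper (the name `format_limit` is an assumption) makes the 0-means-unlimited convention explicit:

```python
def format_limit(memory_bytes: int) -> str:
    # docker inspect reports .HostConfig.Memory in bytes; 0 means unlimited.
    if memory_bytes == 0:
        return "unlimited"
    return f"{memory_bytes / (1024 * 1024):.0f} MB"
```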

3. Monitor live memory usage

Use docker stats to see real-time memory consumption. [src2]

docker stats
docker stats --no-stream
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}\t{{.PIDs}}"

Verify: Watch memory over time — gradual growth = leak; spikes = load-dependent.
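
The leak-versus-spike reading in the Verify line can be scripted. A sketch (function name and thresholds are assumptions) over memory readings in MB sampled at a fixed interval:

```python
def classify_memory_pattern(samples_mb: list[float], slope_mb: float = 1.0) -> str:
    # Gradual, monotonic growth suggests a leak; one large jump suggests a
    # load-dependent spike; anything else looks stable.
    if len(samples_mb) < 2:
        return "insufficient data"
    deltas = [b - a for a, b in zip(samples_mb, samples_mb[1:])]
    if max(deltas) > 10 * slope_mb:
        return "spike: profile the code path active at that time"
    if all(d >= 0 for d in deltas) and samples_mb[-1] - samples_mb[0] > slope_mb * len(deltas):
        return "leak: memory grows steadily over the window"
    return "stable"
```

Feeding it the MemUsage column from periodic docker stats samples turns the eyeball test into a repeatable check.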

4. Set appropriate memory limits

Configure limits based on observed usage plus headroom. [src1]

docker run --memory=512m --memory-swap=1g myapp
docker run --memory=512m --memory-swap=512m myapp  # No swap
docker run --memory=1g --memory-reservation=512m myapp  # Soft limit
# Docker Compose
services:
  app:
    deploy:
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 256M
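
The "observed peak plus 20-30% headroom" rule from the decision tree can be turned into a sizing helper; a sketch (the 64 MB rounding is an assumption, chosen only to produce tidy --memory values):

```python
import math

def suggested_limit_mb(observed_peak_mb: float, headroom: float = 0.25) -> int:
    # Peak plus headroom, rounded up to the next 64 MB boundary.
    raw = observed_peak_mb * (1 + headroom)
    return int(math.ceil(raw / 64) * 64)
```

A 400 MB observed peak with 25% headroom suggests --memory=512m.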

5. Configure JVM / Node.js / Python memory

Runtime-specific settings must align with container limits. [src1, src4]

# Java 10+ (container-aware)
docker run --memory=1g myapp java -XX:MaxRAMPercentage=75.0 -jar app.jar

# Node.js
docker run --memory=512m myapp node --max-old-space-size=384 app.js
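
The step heading also mentions Python, which has no heap flag comparable to -Xmx. One option (a sketch, not drawn from the sources) is for the application to cap its own address space with the stdlib resource module, so oversized allocations raise MemoryError the app can handle instead of triggering the kernel OOM killer:

```python
import resource

def cap_memory(limit_mb: int) -> None:
    # Cap this process's virtual address space (Linux). Allocations beyond
    # the cap fail with MemoryError rather than an OOM kill.
    limit = limit_mb * 1024 * 1024
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))

# e.g. inside a 512 MB container, leave headroom for the interpreter itself:
# cap_memory(448)
```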

6. Profile memory usage inside the container

Get detailed memory breakdown. [src4, src8]

docker exec <cid> ps aux --sort=-%mem | head -20
docker exec <cid> cat /proc/meminfo | head -10
docker exec <cid> cat /proc/1/status | grep -E "VmRSS|VmSize|VmPeak"
# cgroups v2: check memory pressure
docker exec <cid> cat /sys/fs/cgroup/memory.pressure 2>/dev/null

7. Set up memory alerts

Proactive monitoring before OOM kills. [src5]

#!/bin/bash
THRESHOLD=80
while true; do
    docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | while read name pct; do
        pct_num=$(echo "$pct" | tr -d '%')
        if (( $(echo "$pct_num > $THRESHOLD" | bc -l) )); then
            echo "WARNING $(date): $name at ${pct}% memory"
        fi
    done
    sleep 30
done

Code Examples

Container memory diagnostics script

Full script: container-memory-diagnostics-script.sh (43 lines)

#!/bin/bash
# Input:  Container ID or name
# Output: Complete memory diagnostic report

CID="$1"
if [ -z "$CID" ]; then echo "Usage: $0 <container_id>"; exit 1; fi

echo "=== Memory Diagnostic Report ==="
echo "Container: $CID"
echo "Time: $(date -u)"

STATE=$(docker inspect "$CID" --format='{{.State.Status}}' 2>/dev/null)
if [ -z "$STATE" ]; then echo "Container not found"; exit 1; fi

OOM=$(docker inspect "$CID" --format='{{.State.OOMKilled}}')
EXIT=$(docker inspect "$CID" --format='{{.State.ExitCode}}')
echo "Status: $STATE | Exit: $EXIT | OOMKilled: $OOM"

MEM=$(docker inspect "$CID" --format='{{.HostConfig.Memory}}')
echo "Memory limit: $(echo "$MEM" | awk '{if($1>0) printf "%.0fMB\n",$1/1024/1024; else print "unlimited"}')"

if [ "$STATE" = "running" ]; then
    echo "=== Current Usage ==="
    docker stats --no-stream --format "Memory: {{.MemUsage}} ({{.MemPerc}})" "$CID"
    echo "=== Top Processes ==="
    docker exec "$CID" ps aux --sort=-%mem 2>/dev/null | head -6
fi

echo "=== Last 10 Log Lines ==="
docker logs --tail 10 "$CID" 2>&1

echo "=== Host OOM Events ==="
OOM_LINES=$(dmesg 2>/dev/null | grep -i "oom\|killed process" | tail -5)
echo "${OOM_LINES:-(unavailable)}"

Docker Compose with proper memory management

Full script: docker-compose-with-proper-memory-management.yml (62 lines)

version: "3.8"
services:
  api:
    build: ./api
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: "1.0"
        reservations:
          memory: 256M
    environment:
      JAVA_OPTS: "-XX:MaxRAMPercentage=75.0 -XX:+UseG1GC"
      NODE_OPTIONS: "--max-old-space-size=384"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      retries: 3

  worker:
    build: ./worker
    deploy:
      resources:
        limits:
          memory: 1G
    restart: on-failure:5

  postgres:
    image: postgres:16
    deploy:
      resources:
        limits:
          memory: 1G
    environment:
      POSTGRES_SHARED_BUFFERS: "256MB"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s

  redis:
    image: redis:7-alpine
    deploy:
      resources:
        limits:
          memory: 256M
    command: ["redis-server", "--maxmemory", "200mb", "--maxmemory-policy", "allkeys-lru"]

Memory leak detection in running container

Full script: memory-leak-detection-in-running-container.py (50 lines)

#!/usr/bin/env python3
# Input:  Container name to monitor
# Output: Alert when memory growth exceeds threshold

import subprocess, json, time, sys
from datetime import datetime

def get_container_memory(cid):
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{json .}}", cid],
        capture_output=True, text=True
    )
    if result.returncode != 0:
        return None
    data = json.loads(result.stdout.strip())
    usage_str = data.get("MemUsage", "").split("/")[0].strip()
    if "GiB" in usage_str:
        return float(usage_str.replace("GiB", "")) * 1024 * 1024 * 1024
    elif "MiB" in usage_str:
        return float(usage_str.replace("MiB", "")) * 1024 * 1024
    elif "KiB" in usage_str:
        return float(usage_str.replace("KiB", "")) * 1024
    return 0.0

def monitor(cid, interval=30, growth_threshold_mb=50, samples=10):
    history = []
    print(f"Monitoring {cid} every {interval}s...")
    while True:
        mem = get_container_memory(cid)
        if mem is None:
            print("WARNING: Container not found"); break
        history.append((datetime.now(), mem))
        history = history[-samples:]  # bound the list so the monitor itself does not grow
        mem_mb = mem / 1024 / 1024
        print(f"[{datetime.now():%H:%M:%S}] {mem_mb:.1f} MB")
        if len(history) >= samples:
            growth_mb = (mem - history[-samples][1]) / 1024 / 1024
            if growth_mb > growth_threshold_mb:
                rate = growth_mb / (samples * interval / 60)
                print(f"ALERT: +{growth_mb:.1f}MB ({rate:.1f} MB/min)")
        time.sleep(interval)

if __name__ == "__main__":
    monitor(sys.argv[1] if len(sys.argv) > 1 else "myapp")

Anti-Patterns

Wrong: No memory limit in production

# BAD — container can consume all host memory [src1]
docker run -d myapp

Correct: Always set memory limits

# GOOD — constrained to 512MB [src1]
docker run -d --memory=512m --memory-swap=1g myapp

Wrong: JVM heap equals container memory

# BAD — no room for non-heap memory [src1, src4]
docker run --memory=1g myapp java -Xmx1g -jar app.jar

Correct: JVM heap at 70-80% of container memory

# GOOD — leaves 25-30% for metaspace, threads, buffers [src1, src4]
docker run --memory=1g myapp java -XX:MaxRAMPercentage=75.0 -jar app.jar

Wrong: Restarting OOM containers without fixing root cause

# BAD — infinite restart loop [src4]
services:
  app:
    restart: always

Correct: Limit restarts and investigate

# GOOD — stops after 5 failures [src4]
services:
  app:
    restart: on-failure:5
    deploy:
      resources:
        limits:
          memory: 512M

Wrong: Using --oom-kill-disable on cgroups v2

# BAD — flag silently discarded on cgroups v2 (Docker 27+) [src7]
docker run --memory=512m --oom-kill-disable myapp

Correct: Set appropriate limits and monitor instead

# GOOD — right-size limit and monitor proactively [src1, src5]
docker run --memory=512m --memory-reservation=384m myapp

Common Pitfalls

Diagnostic Commands

# === Confirm OOM ===
docker inspect <cid> --format='OOMKilled: {{.State.OOMKilled}}, Exit: {{.State.ExitCode}}'
dmesg | grep -i "oom\|killed process" | tail -10

# === Memory Usage ===
docker stats
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"

# === Memory Limits ===
docker inspect <cid> --format='Memory: {{.HostConfig.Memory}}, Swap: {{.HostConfig.MemorySwap}}'

# === Inside Container ===
docker exec <cid> cat /proc/meminfo | head -5
docker exec <cid> cat /proc/1/status | grep VmRSS
docker exec <cid> ps aux --sort=-%mem | head -10

# === cgroups v2 ===
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.current
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.max
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.events
cat /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.pressure

# === cgroups v1 (Legacy) ===
cat /sys/fs/cgroup/memory/docker/<id>/memory.usage_in_bytes
cat /sys/fs/cgroup/memory/docker/<id>/memory.limit_in_bytes

# === Host Memory ===
free -m
vmstat 1 5

# === cgroups version check ===
stat -fc %T /sys/fs/cgroup   # cgroup2fs = v2, tmpfs = v1

Version History & Compatibility

Version | Behavior | Key Changes
Docker 27+ (2024) | Current | OOMScoreAdj for services; --oom-kill-disable and --kernel-memory discarded on cgroups v2; containerd default [src7]
Docker 25-26 | Stable | Improved memory metrics; containerd integration [src1]
Docker 24 | Stable | cgroups v2 fully supported; BuildKit default [src1, src6]
Docker 23 | Stable | Compose V2 default [src1]
Docker 20.10 | LTS-like | cgroups v2 support [src1, src6]
Docker 19.03 | Legacy | --memory-swap standardized [src1]
cgroups v2 | Modern Linux | memory.max; memory.events for OOM count; memory.pressure for stall tracking; memory.high soft throttling [src6]
cgroups v1 | Legacy Linux | memory.limit_in_bytes + memory.oom_control; --oom-kill-disable works [src6]

When to Use / When Not to Use

Use When | Don't Use When | Use Instead
Exit 137 and OOMKilled=true | Exit 137 but OOMKilled=false | Check for docker kill or stop timeout
Memory grows over time in stats | Memory is constant but app is slow | CPU profiling
Java/Node process needs right-sizing | Build step fails with OOM | Increase Docker Desktop memory
Multiple containers compete for memory | Single container with headroom | Investigate other resources (CPU, IO, network)
Container keeps restarting with exit 137 | Container runs but performance degrades | Check cgroups v2 memory.pressure for throttling

Important Caveats

Related Units