How to Debug a Docker Container That Won't Start
How do I debug a Docker container that won't start?
TL;DR
- Bottom line: A Docker container that won't start has either a configuration error (wrong ENTRYPOINT/CMD, missing files, bad permissions) or a runtime crash (application error, missing dependency, out of memory). The exit code tells you the category: 1 = app error, 126 = permission denied, 127 = command not found, 137 = OOM/killed, 139 = segfault.
- Key tool/command: docker logs <container> shows stdout/stderr. docker inspect <container> --format='{{.State.ExitCode}}' shows the exit code. docker run -it --entrypoint /bin/sh <image> gets you an interactive shell.
- Watch out for: Shell form vs exec form in ENTRYPOINT/CMD: CMD node app.js (shell form) vs CMD ["node", "app.js"] (exec form). Shell form can mask exit codes and prevent signal forwarding.
- Works with: Docker Engine 20.10+ (including v27, v28, v29), Docker Desktop 4.33+, Podman 4+. Also applies to Docker Compose v2/v5, Kubernetes, and ECS.
Constraints
- Always check the OOMKilled field (docker inspect <container> --format='{{.State.OOMKilled}}') before assuming exit code 137 means out of memory; docker kill and docker stop timeouts also produce exit 137. [src3]
- Never mix shell form and exec form expectations: CMD node app.js (shell form) wraps the command in /bin/sh -c and makes the shell PID 1, preventing SIGTERM delivery to the app. [src1]
- Alpine-based images use musl libc; binaries compiled against glibc (Ubuntu/Debian) will segfault (exit 139) or fail with "not found" (exit 127) on Alpine. [src5, src7]
- docker logs only captures stdout/stderr; if the application logs to files inside the container, docker logs shows nothing. Reconfigure the app or use docker exec to read log files. [src2]
- depends_on in Docker Compose (without condition: service_healthy) only waits for the container to start, not for the service to be ready. Always pair it with a healthcheck. [src4]
- On macOS/Windows, Docker runs inside a Linux VM; memory limits set via --memory apply within the VM allocation, not host RAM. Check Docker Desktop settings for total VM memory. [src3]
Quick Reference
| # | Exit Code | Signal | Meaning | Most Common Cause | Fix |
|---|---|---|---|---|---|
| 1 | 1 | — | Generic error | Application crash, missing env var, config error | Check docker logs; fix app code or config [src4] |
| 2 | 2 | — | Shell misuse | Bad shell syntax in entrypoint script | Fix shell script syntax [src4] |
| 3 | 126 | — | Permission denied | Entrypoint/script not executable | chmod +x entrypoint.sh in Dockerfile [src5] |
| 4 | 127 | — | Command not found | Binary missing from image, typo in CMD | Check which <cmd> inside container; fix Dockerfile [src7] |
| 5 | 137 | SIGKILL (9) | Out of memory / killed | Container exceeded memory limit; OOM killer | Increase --memory limit or optimize app [src5] |
| 6 | 139 | SIGSEGV (11) | Segmentation fault | Architecture mismatch (ARM vs x86), C library incompatibility | Build for correct platform; check binary compatibility [src5] |
| 7 | 143 | SIGTERM (15) | Graceful termination | docker stop sent SIGTERM; app handled it | Normal shutdown, not an error [src4] |
| 8 | 255 | — | Exit status out of range | Application returned invalid exit code | Fix app to return 0-125 [src4] |
| 9 | 0 | — | Success | Container completed its task normally | Normal — one-shot container finished [src4] |
| 10 | — | — | No logs, immediate exit | No foreground process; CMD runs in background | Keep process in foreground [src1] |
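The 137, 139, and 143 rows follow one rule: a code above 128 is 128 plus the number of the signal that ended the process. The sketch below decodes a code that way. The function name describe_exit_code is hypothetical, and signal names come from the platform's own table, so the result depends on the host it runs on.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Map a container exit code to a short description.

    Codes above 128 conventionally mean "terminated by signal (code - 128)".
    """
    if code == 0:
        return "success"
    if code == 126:
        return "command found but not executable"
    if code == 127:
        return "command not found"
    if 128 < code < 255:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The signal number has no name on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return "application-defined error"


for code in (1, 127, 137, 139, 143):
    print(code, "->", describe_exit_code(code))
```

The decoded signal says how the process died, not why: 137 still needs the OOMKilled check.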
Decision Tree
START — Container won't start or exits immediately
├── What is the exit code? (docker inspect --format='{{.State.ExitCode}}')
│ ├── Exit 0 → Container finished successfully
│ │ └── If unexpected → CMD terminates immediately → keep in foreground [src1]
│ ├── Exit 1 → Application error
│ │ ├── Check docker logs → find the error message [src2]
│ │ ├── Missing env var? → Add -e VAR=value
│ │ ├── Missing file/config? → Check COPY/ADD in Dockerfile [src1]
│ │ └── Dependency error? → Check package installation
│ ├── Exit 126 → Permission denied
│ │ └── chmod +x on entrypoint script; check USER directive [src5]
│ ├── Exit 127 → Command not found
│ │ ├── Typo in CMD/ENTRYPOINT? → Fix Dockerfile [src7]
│ │ ├── Binary not installed? → Add to RUN install [src7]
│ │ └── Wrong base image? → alpine vs debian (musl vs glibc) [src7]
│ ├── Exit 137 → OOM or killed
│ │ ├── OOMKilled=true → Increase --memory or optimize app [src5]
│ │ └── OOMKilled=false → Manual kill or orchestrator [src5]
│ ├── Exit 139 → Segfault
│ │ ├── Architecture mismatch? → Build for correct platform [src5]
│ │ └── musl vs glibc? → Use compatible binaries [src5]
│ └── No exit code / keeps restarting → Check restart policy [src4]
├── Are there any logs?
│ ├── YES → Read: docker logs --tail 50 [src2]
│ └── NO → Use docker debug or: docker run -it --entrypoint /bin/sh [src6]
└── Multi-stage build issue?
└── Files missing from final stage → Check COPY --from= [src1]
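The exit-code branch of the tree above can be written as a small lookup. This is a minimal sketch: next_step and its parameters are hypothetical names, and the returned strings paraphrase the tree rather than quote any Docker output.

```python
from typing import Optional


def next_step(exit_code: Optional[int], oom_killed: bool = False,
              has_logs: bool = True) -> str:
    """Suggest the next debugging action for a failed container."""
    if exit_code is None:
        return "check the restart policy"
    if exit_code == 137:
        # 137 is ambiguous: only OOMKilled=true confirms a memory problem.
        if oom_killed:
            return "increase --memory or optimize the app"
        return "look for a manual kill or an orchestrator action"
    known = {
        0: "keep the main process in the foreground if the exit was unexpected",
        1: "read docker logs for the application error",
        126: "chmod +x the entrypoint script and check USER",
        127: "check CMD/ENTRYPOINT spelling and that the binary is installed",
        139: "check image architecture and musl vs glibc",
    }
    if exit_code in known:
        return known[exit_code]
    if has_logs:
        return "read docker logs --tail 50"
    return "open a shell with docker debug or --entrypoint /bin/sh"


print(next_step(137, oom_killed=True))
print(next_step(127))
```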
Step-by-Step Guide
1. Check the container status and exit code
Start with docker ps -a to see all containers. [src3, src4]
docker ps -a
docker inspect <cid> --format='{{.State.ExitCode}}'
docker inspect <cid> --format='{{.State.OOMKilled}}'
docker inspect <cid> --format='{{json .State}}' | python -m json.tool
Verify: Use the Quick Reference table to interpret the exit code.
2. Read the container logs
Logs show stdout/stderr from the container's main process. [src2]
docker logs <cid>
docker logs --tail 50 <cid>
docker logs -f <cid>
docker logs -t <cid>
Verify: Look for error messages, stack traces, "file not found", "permission denied".
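When the log is long, a first pass can be automated. The sketch below tags captured log text with the error categories named in the Verify line. The pattern table is a hypothetical starting point and not an exhaustive list of messages Docker or any runtime emits.

```python
import re
from typing import List

# Hypothetical pattern table; extend it for your own stack.
PATTERNS = {
    "permission": re.compile(r"permission denied", re.I),
    # The lookbehind keeps "executable file not found" out of this bucket.
    "missing_file": re.compile(
        r"no such file or directory|(?<!executable )file not found", re.I),
    "missing_command": re.compile(
        r"command not found|executable file not found", re.I),
    "stack_trace": re.compile(
        r"traceback \(most recent call last\)|^\s+at .+\(.+\)$", re.I | re.M),
}


def classify_logs(text: str) -> List[str]:
    """Return the names of all patterns that occur in the log text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]


sample = "sh: 1: ./entrypoint.sh: Permission denied"
print(classify_logs(sample))
```

Feed it the output of docker logs --tail 50 <cid>; an empty result means the log needs a manual read, not that the container is healthy.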
3. Use Docker Debug for non-invasive inspection
Docker Debug (GA since Docker Desktop 4.33, free for all users) lets you attach a debugging shell to any container — even stopped or minimal ones. [src8]
# Docker Debug (Docker Desktop 4.33+)
docker debug <cid>
# Traditional fallback: override entrypoint
docker run -it --entrypoint /bin/sh <image>
# Debug exited container
docker commit <exited_cid> debug-image
docker run -it --entrypoint /bin/sh debug-image
# Exec into running container
docker exec -it <cid> /bin/sh
Verify: Manually run the CMD command to see the exact error.
4. Inspect the Dockerfile for common mistakes
Review for configuration errors. [src1]
# Use exec form CMD (receives signals properly)
CMD ["node", "server.js"]
# Make scripts executable
COPY --chmod=755 entrypoint.sh /entrypoint.sh
# Run in foreground
CMD ["nginx", "-g", "daemon off;"]
# Multi-stage: copy build output
COPY --from=builder /app/dist ./dist
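The shell-form mistake is easy to catch mechanically. The sketch below flags CMD and ENTRYPOINT lines that are not written as a JSON array. find_shell_form is a hypothetical helper that reads the Dockerfile as plain text; it is not a full Dockerfile parser, and it skips continuation lines so that a CMD belonging to a HEALTHCHECK is not flagged.

```python
import re
from typing import List, Tuple

INSTRUCTION = re.compile(r"^\s*(CMD|ENTRYPOINT)\s+(.*)$", re.IGNORECASE)


def find_shell_form(dockerfile_text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) for CMD/ENTRYPOINT lines in shell form."""
    hits = []
    continued = False
    for lineno, line in enumerate(dockerfile_text.splitlines(), start=1):
        is_continuation = continued
        # A trailing backslash means the next line belongs to this instruction.
        continued = line.rstrip().endswith("\\")
        if is_continuation:
            continue
        match = INSTRUCTION.match(line)
        if match and not match.group(2).lstrip().startswith("["):
            hits.append((lineno, line.strip()))
    return hits


example = "FROM node:20\nCOPY . /app\nCMD node server.js\n"
for lineno, line in find_shell_form(example):
    print(f"line {lineno}: shell form: {line}")
```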
5. Check environment variables and secrets
Missing env vars are one of the most common causes of exit code 1. [src4]
docker inspect <image> --format='{{json .Config.Env}}'
docker run -e DATABASE_URL=... -e SECRET_KEY=... <image>
docker run --env-file .env <image>
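To compare what the image provides with what the app needs, the JSON list printed by the first command can be checked against a list of required names. This is a sketch under assumptions: the required names are examples, and treating an empty value as missing is a choice made here, not Docker behavior.

```python
from typing import Dict, Iterable, List


def parse_env_list(env_list: Iterable[str]) -> Dict[str, str]:
    """Turn ["KEY=value", ...] (the shape of .Config.Env) into a dict."""
    env = {}
    for item in env_list:
        key, _, value = item.partition("=")
        env[key] = value
    return env


def missing_vars(required: Iterable[str], env_list: Iterable[str]) -> List[str]:
    """Return required names that are absent or empty in the env list."""
    env = parse_env_list(env_list)
    return [name for name in required if not env.get(name)]


image_env = ["PATH=/usr/local/bin:/usr/bin", "DATABASE_URL=postgres://db/app"]
print(missing_vars(["DATABASE_URL", "SECRET_KEY"], image_env))
```

Variables passed with -e or --env-file at run time are not part of the image config, so run the check against the container (docker inspect <cid>) when those are in play.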
6. Check resource limits
Exit code 137 often means the container hit its memory limit, but confirm with the OOMKilled field before raising limits. [src5]
docker stats <cid>
docker inspect <cid> --format='{{.HostConfig.Memory}}'
docker run --memory=512m --memory-swap=1g <image>
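The inspect command above prints the limit in bytes, with 0 meaning no limit, while --memory takes suffixed values. The sketch below converts between the two. It handles only whole numbers with a single b/k/m/g suffix, which is narrower than what the Docker CLI accepts.

```python
UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_memory(value: str) -> int:
    """Convert a value such as "512m" or "1g" to bytes."""
    text = value.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def describe_limit(host_config_memory: int) -> str:
    """Describe the .HostConfig.Memory value reported by docker inspect."""
    if host_config_memory == 0:
        return "no memory limit set"
    return f"{host_config_memory / 1024 ** 2:.0f} MiB limit"


print(parse_memory("512m"))
print(describe_limit(parse_memory("512m")))
```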
7. Verify networking and dependencies
Containers depending on databases may fail if those aren't ready. [src4]
# docker-compose.yml — proper dependency waiting
services:
  app:
    depends_on:
      db:
        condition: service_healthy
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready"]
      interval: 5s
      retries: 5
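Where a Compose healthcheck is not available (plain docker run, for example), the app can wait for its dependency itself. The sketch below retries a TCP connection until a deadline. wait_for_port is a hypothetical helper, and an open port only shows that something is listening, which is weaker than the pg_isready check above.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 1.0) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller would exit non-zero when this returns False, for example after wait_for_port("db", 5432) fails, so the failure shows up as a clear exit code 1 with a log line instead of an opaque crash. The host name db and port 5432 are assumptions taken from the Compose example.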
Code Examples
Comprehensive debug entrypoint script
#!/bin/sh
# Input: Container that fails with unclear errors
# Output: Detailed diagnostic output before main process starts
set -e
echo "=== Container Debug Info ==="
echo "Date: $(date -u)"
echo "Hostname: $(hostname)"
echo "User: $(whoami) (UID=$(id -u), GID=$(id -g))"
echo "Workdir: $(pwd)"
echo "OS: $( (grep PRETTY_NAME /etc/os-release 2>/dev/null || echo 'PRETTY_NAME=unknown') | cut -d= -f2)"
echo "Arch: $(uname -m)"
echo ""
echo "=== Environment ==="
# DATABASE_URL is filtered as well because it can embed credentials
env | sort | grep -v -E '(PASSWORD|SECRET|KEY|TOKEN|DATABASE_URL)' || true
echo ""
echo "=== File Permissions ==="
ls -la /app/ 2>/dev/null || echo "/app does not exist"
echo ""
echo "=== Connectivity Check ==="
if [ -n "$DATABASE_URL" ]; then
DB_HOST=$(echo "$DATABASE_URL" | sed -E 's|.*@([^:/]+).*|\1|')
DB_PORT=$(echo "$DATABASE_URL" | sed -E 's|.*:([0-9]+)/.*|\1|')
echo "Checking DB: $DB_HOST:$DB_PORT"
nc -zv "$DB_HOST" "$DB_PORT" 2>&1 || echo "Cannot reach database!"
fi
echo ""
echo "=== Starting Application ==="
exec "$@"
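The sed expressions in the connectivity check assume the URL always carries an explicit port; without one, the port expression leaves the whole URL unchanged. If Python is available in the image, a URL parser is a more forgiving alternative. A sketch, with db_host_port as a hypothetical helper and 5432 as an assumed PostgreSQL default:

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def db_host_port(database_url: str,
                 default_port: int = 5432) -> Tuple[Optional[str], int]:
    """Extract (host, port) from a URL such as postgres://user:pw@db:5432/app."""
    parts = urlsplit(database_url)
    # parts.port is None when the URL has no explicit port.
    return parts.hostname, parts.port or default_port


print(db_host_port("postgres://user:pw@db:5433/app"))
print(db_host_port("postgres://user:pw@db/app"))
```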
Docker health check patterns
# Node.js health check
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
CMD node -e "require('http').get('http://localhost:3000/health', (r) => { process.exit(r.statusCode === 200 ? 0 : 1) })"
# Python/Flask health check
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:5000/health')" || exit 1
# Nginx health check
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
CMD curl -f http://localhost/ || exit 1
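A health check command signals its result through the exit status: 0 for healthy, 1 for unhealthy. For images without curl, the same contract can be written as a small Python function with an explicit timeout. This is a sketch; health_exit_code is a hypothetical name and the /health path comes from the examples above, not from any framework default.

```python
import urllib.error
import urllib.request


def health_exit_code(url: str, timeout: float = 3.0) -> int:
    """Return 0 if the URL answers with HTTP 200, otherwise 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 0 if response.status == 200 else 1
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return 1
```

A script used as the HEALTHCHECK command would end with sys.exit(health_exit_code("http://localhost:5000/health")).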
Automated container restart diagnostics
# Input: Containers that keep restarting
# Output: Script that captures diagnostic info from crashing containers
import json
import subprocess
import sys


def diagnose_container(cid):
    state = json.loads(subprocess.check_output(
        ["docker", "inspect", "--format", "{{json .State}}", cid]
    ))
    print(f"=== Container: {cid} ===")
    print(f"Status: {state.get('Status')}, Exit: {state.get('ExitCode')}")
    print(f"OOMKilled: {state.get('OOMKilled')}")
    meanings = {
        0: "Success", 1: "App error", 126: "Permission denied",
        127: "Command not found", 137: "OOM/killed", 139: "Segfault",
    }
    print(f"Meaning: {meanings.get(state.get('ExitCode', -1), 'Unknown')}")
    logs = subprocess.check_output(
        ["docker", "logs", "--tail", "30", cid], stderr=subprocess.STDOUT
    ).decode('utf-8', errors='replace')
    print(f"\n=== Last 30 log lines ===\n{logs}")
    config = json.loads(subprocess.check_output(
        ["docker", "inspect", "--format", "{{json .Config}}", cid]
    ))
    print(f"Image: {config.get('Image')}")
    print(f"Entrypoint: {config.get('Entrypoint')}")
    print(f"Cmd: {config.get('Cmd')}")


if __name__ == "__main__":
    diagnose_container(sys.argv[1])
Anti-Patterns
Wrong: Shell form CMD (doesn't receive signals)
# BAD — runs under /bin/sh -c, PID 1 is shell, not your app [src1]
CMD node server.js
Correct: Exec form CMD
# GOOD — app is PID 1, receives signals directly [src1]
CMD ["node", "server.js"]
Wrong: Process daemonizes (container exits immediately)
# BAD — nginx forks to background, container stops [src1]
CMD ["nginx"]
Correct: Run in foreground mode
# GOOD — stays in foreground as PID 1 [src1]
CMD ["nginx", "-g", "daemon off;"]
Wrong: No wait-for-dependencies logic
# BAD — depends_on only waits for container START, not READY [src4]
services:
  app:
    depends_on:
      - db
Correct: Health check + condition
# GOOD — waits until DB is accepting connections [src4]
services:
  app:
    depends_on:
      db:
        condition: service_healthy
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready"]
      interval: 5s
      retries: 5
Common Pitfalls
- Shell form vs exec form: Shell form (CMD node app.js) wraps your command in /bin/sh -c, making the shell PID 1. Your app won't receive SIGTERM on docker stop. Always use exec form. [src1]
- ENTRYPOINT + CMD interaction: When both are set, CMD becomes arguments to ENTRYPOINT. Overriding CMD at runtime replaces CMD, not ENTRYPOINT. [src1]
- Alpine musl vs glibc: Binaries compiled for glibc (Ubuntu/Debian) often segfault (exit 139) on Alpine (musl). Use Alpine-native builds or *-slim Debian images. [src5, src7]
- Multi-stage missing runtime deps: Copying only the binary from the build stage but forgetting shared libraries or CA certificates causes exit 127 or TLS errors. [src1]
- docker stop timeout: docker stop sends SIGTERM, waits 10s, then sends SIGKILL. If your app always exits 137 on shutdown, it's not handling SIGTERM. [src5]
- Line endings (Windows to Linux): Scripts with \r\n line endings cause "command not found" on Linux. Use dos2unix or configure your editor. [src7]
- Compose v5 Go SDK changes: Docker Desktop 4.38+ ships Compose v5 with a new Go SDK. Build configurations that worked with Compose v2 may need updating for Bake-based workflows. [src8]
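The line-ending pitfall can be checked before the image is built. The sketch below works on raw bytes so that no text-mode newline translation hides the problem. has_crlf and to_lf are hypothetical helpers doing roughly what dos2unix does for this one case.

```python
def has_crlf(data: bytes) -> bool:
    """Return True if the content contains Windows (CRLF) line endings."""
    return b"\r\n" in data


def to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, leaving other bytes untouched."""
    return data.replace(b"\r\n", b"\n")


script = b"#!/bin/sh\r\necho hello\r\n"
print(has_crlf(script))
print(to_lf(script))
```

To check a real file, read it with open(path, "rb") and pass the bytes in.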
Diagnostic Commands
# === Container Status ===
docker ps -a
docker inspect <cid> --format='{{.State.ExitCode}}'
docker inspect <cid> --format='{{.State.OOMKilled}}'
docker inspect <cid> --format='{{json .State}}'
# === Logs ===
docker logs <cid>
docker logs --tail 50 <cid>
docker logs -f <cid>
# === Docker Debug (Docker Desktop 4.33+) ===
docker debug <cid>
# === Interactive Debugging (fallback) ===
docker run -it --entrypoint /bin/sh <image>
docker exec -it <cid> /bin/sh
docker commit <cid> debug && docker run -it --entrypoint /bin/sh debug
# === Image Inspection ===
docker inspect <image> --format='{{json .Config.Entrypoint}}'
docker inspect <image> --format='{{json .Config.Cmd}}'
docker history <image>
# === Resources ===
docker stats <cid>
docker system df
# === Restart Policy ===
docker inspect <cid> --format='{{.HostConfig.RestartPolicy.Name}}'
docker update --restart=no <cid>
# === Docker Compose ===
docker compose logs <service>
docker compose config
docker compose ps
Version History & Compatibility
| Version | Behavior | Key Changes |
|---|---|---|
| Docker 29 (2025-11-10) | Current | docker ps --format '{{.HealthStatus}}' placeholder for health state; private time namespace default; runtime support on Windows; local log driver custom attributes; CLI plugin error hooks; API v1.44 minimum [src9] |
| Docker 28 (2025) | Stable | Networking improvements, NRI support; BuildKit v0.23, Buildx v0.25, Compose v2.37 [src8] |
| Docker 27 (2024) | Stable | Containerd image store default; kernel module loading improvements [src8] |
| Docker 25+ (2024) | Stable | docker init for Dockerfile generation [src1] |
| Docker 24 | Stable | containerd image store option (experimental) [src1] |
| Docker 23 | Stable | BuildKit default builder on Linux; Compose V2 default (docker compose) [src1] |
| Docker 20.10 | LTS-like | cgroups v2 support; --platform flag [src1] |
| Docker Desktop 4.33+ | GA | docker debug free for all users [src8] |
| Docker Desktop 4.38+ | Current | Bake GA, Docker AI Agent, multi-node K8s testing [src8] |
| Podman 4+ | Drop-in replacement | Same CLI; rootless by default [src3] |
Decision Logic
If exit code is 1 and docker logs shows a stack trace
→ Application crashed. Read the trace bottom-up to find the root cause (missing env var, bad config, unhandled exception). Fix the app code or supply the missing input. [src2, src4]
If exit code is 127 and the command appears correct
→ Either binary is missing from the image (typo, not installed) or the base image is wrong (Alpine musl vs glibc binary). Run docker run -it --entrypoint /bin/sh <image> and which <command> to confirm. [src5, src7]
If exit code is 137 and OOMKilled=true
→ Container exceeded its memory limit. Increase --memory (or mem_limit in Compose) or profile and reduce the app's heap usage. On Docker Desktop, also verify the Linux VM has enough RAM allocated. [src3, src5]
If exit code is 137 and OOMKilled=false
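→ Something other than the container's memory limit triggered SIGKILL. The usual cause is docker stop: SIGTERM was sent and the app didn't exit within the grace period (default 10s), so SIGKILL followed. Add a SIGTERM signal handler in the app, or extend the grace period with --stop-timeout. A docker kill or an orchestrator terminating the container produces the same exit code. [src1, src5]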
If exit code is 139 (segfault) on Apple Silicon, ARM, or after a base-image swap
→ Architecture or libc mismatch. Rebuild with docker build --platform linux/amd64 (or the target arch), or switch from alpine (musl) to *-slim (glibc) base. [src5, src7]
If docker logs is empty but container exits immediately
→ Either the app logs to a file inside the container (reconfigure to stdout/stderr) OR the CMD daemonizes (e.g. nginx, apache2ctl without foreground flag). Re-run with docker debug (Desktop 4.33+) or override the entrypoint to inspect. [src1, src2, src8]
If container starts then exits with code 0 immediately
→ The main process completed and exited (expected for one-shot containers like migrations). If unintended, the CMD is forking to background — keep the process in foreground with daemon off;, -D FOREGROUND, or similar. [src1]
If Docker Compose service starts before its database is ready
→ depends_on alone only waits for container start, not readiness. Add a healthcheck to the dependency and use depends_on: db: condition: service_healthy (Compose v2+). [src4]
If you can't get a shell because the image is distroless/scratch
→ Use docker debug <container> (Docker Desktop 4.33+ GA) — it attaches a debug toolbox to even stopped or shell-less containers without modifying them. Otherwise docker commit the exited container and run an interactive shell against the snapshot. On Docker 29, docker ps --format '{{.HealthStatus}}' also surfaces health state directly. [src6, src8, src9]
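For the exit 137 with OOMKilled=false case above, the fix on the application side is a SIGTERM handler. A process running as PID 1 gets no default SIGTERM action from the kernel, so without a handler the signal is ignored until the grace period ends. The sketch below shows one way to do it in Python. main_loop and stop_requested are hypothetical names, and the loop body stands in for real work.

```python
import signal
import threading

# Set by the handler; the main loop polls it between units of work.
stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request; cleanup happens in the main loop."""
    stop_requested.set()


signal.signal(signal.SIGTERM, handle_sigterm)


def main_loop(poll_seconds: float = 1.0) -> int:
    """Run until SIGTERM arrives, then return a clean exit status."""
    while not stop_requested.wait(poll_seconds):
        pass  # placeholder for one unit of real work
    # Close connections and flush buffers here before returning.
    return 0
```

This only helps when the app actually receives the signal, which means exec-form CMD or an entrypoint script that ends with exec.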
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Container exits with non-zero code | Container runs but app behaves wrong | Application-level debugging |
| Container loop-restarts | Image builds fail | docker build --no-cache |
| Container starts but crashes after seconds | Network-level issues between containers | docker network inspect |
| docker compose up fails for one service | Orchestrator-level issues (K8s) | kubectl describe pod |
| Exit code appears but logs are empty | Docker Desktop itself won't start | Docker Desktop troubleshooting docs |
Important Caveats
- docker logs only shows stdout/stderr. If your app logs to a file inside the container, docker logs shows nothing. Configure your app to log to stdout/stderr.
- Exit code 137 doesn't always mean OOM; it can also mean docker kill or a docker stop timeout. Check OOMKilled in docker inspect.
- docker stop sends SIGTERM first, then SIGKILL after the grace period (default 10s). If your app always exits 137, add a SIGTERM handler.
- On macOS/Windows, Docker runs inside a Linux VM. Memory limits are within the VM's allocation. Check Docker Desktop settings.
- docker commit captures the filesystem but not volumes, network config, or running processes. Useful for debugging, not production.
- depends_on in Docker Compose only waits for the container to start, not for it to be ready. Use condition: service_healthy with a healthcheck.
- docker debug (Docker Desktop 4.33+) can attach to stopped containers and minimal images (distroless/scratch); it injects a debug toolbox without modifying the target container. Free for all Docker users since late 2024.