How to Resolve Terraform State Conflicts and Lock Issues
How do I resolve Terraform state conflicts and lock issues?
TL;DR
- Bottom line: Most Terraform state lock errors are stale locks from interrupted processes. Identify the lock holder, confirm no operation is running, then run terraform force-unlock <LOCK_ID>.
- Key tool/command: terraform force-unlock <LOCK_ID>
- Watch out for: Force-unlocking while another process is actually running, which causes state corruption and potential infrastructure drift.
- Works with: Terraform >= 0.12, OpenTofu >= 1.6. S3+DynamoDB, S3 native locking (>= 1.10), Terraform Cloud, Azure Blob, GCS.
Constraints
- Never run terraform force-unlock while another terraform process is actively running: concurrent writes corrupt state
- Always verify no other operations are in progress (check CI/CD pipelines, teammates, scheduled jobs) before force-unlocking
- terraform state push requires local serial > remote serial: pushing an older serial overwrites newer state and loses changes
- Local state files (.tfstate on disk) cannot be unlocked by terraform force-unlock; only remote backends support this command
- DynamoDB-based locking is deprecated as of Terraform 1.10: migrate to native S3 locking (use_lockfile = true) before it is removed
Quick Reference
| # | Cause | Likelihood | Signature | Fix |
|---|---|---|---|---|
| 1 | Stale lock from crashed/interrupted process | ~40% | Error acquiring the state lock + old timestamp | terraform force-unlock <LOCK_ID> |
| 2 | CI/CD pipeline killed mid-apply | ~20% | Lock info shows CI runner hostname | Kill orphan process, then terraform force-unlock <LOCK_ID> |
| 3 | Concurrent terraform runs (no CI mutex) | ~12% | Two pipelines/users running simultaneously | Wait for first to complete, add CI concurrency controls |
| 4 | DynamoDB lock table permission denied | ~8% | ConditionalCheckFailedException | Fix IAM: add dynamodb:PutItem, dynamodb:DeleteItem |
| 5 | State serial mismatch after state push | ~6% | Error uploading state: Conflict + MD5 mismatch | Pull state, increment serial, push back |
| 6 | Backend migration lock conflict | ~5% | Lock error during terraform init -migrate-state | Complete/abort previous migration, force-unlock, retry |
| 7 | S3 lock file not cleaned up | ~3% | .tflock file persists in S3 bucket | Delete *.tflock from S3 manually |
| 8 | Network interruption during apply | ~2% | Process exited uncleanly, lock remains | Verify process dead, terraform force-unlock |
| 9 | Terraform Cloud workspace locked | ~2% | Workspace shows "Locked" in UI | Unlock via UI or API POST /actions/unlock |
| 10 | Azure Blob lease not released | ~1% | Azure lease ID in error | az storage blob lease break |
| 11 | Lock ID mismatch on unlock attempt | ~1% | lock ID does not match existing lock ID | Get correct lock ID from backend, retry |
Decision Tree
START
├── Error contains "Error acquiring the state lock"?
│ ├── YES → Is another terraform process running (CI/CD, teammates)?
│ │ ├── YES → Wait for it to finish, or add -lock-timeout=10m
│ │ └── NO → Extract LOCK_ID from error → terraform force-unlock <LOCK_ID>
│ └── NO ↓
├── Error contains "ConditionalCheckFailedException"?
│ ├── YES → Check DynamoDB IAM permissions (PutItem, DeleteItem, GetItem)
│ │ ├── Permissions OK → Stale DynamoDB lock → delete item from lock table
│ │ └── Permissions missing → Fix IAM policy
│ └── NO ↓
├── Error contains "Error uploading state: Conflict" or "MD5 hash"?
│ ├── YES → Serial mismatch → pull state, increment serial by 2, push back
│ └── NO ↓
├── Error contains "lock ID does not match"?
│ ├── YES → Extract correct lock ID from backend → retry force-unlock
│ └── NO ↓
├── Using Terraform Cloud and workspace shows "Locked"?
│ ├── YES → Unlock via UI or API POST /actions/unlock
│ └── NO ↓
└── DEFAULT → Run terraform init -reconfigure, then retry operation
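The tree above can be sketched as a small shell helper for wrappers or runbooks. This is a minimal sketch, not part of Terraform: classify_tf_error and the labels it prints are hypothetical names, and the matching relies only on the error strings quoted in the tree. The Terraform Cloud branch is left out because it depends on workspace status rather than error text.

```shell
#!/usr/bin/env bash
# classify_tf_error: hypothetical helper that maps captured Terraform
# output to a branch of the decision tree. Patterns are checked in tree order.
classify_tf_error() {
  local msg="$1"
  case "$msg" in
    *"Error acquiring the state lock"*)
      echo "check-lock-holder" ;;      # wait, or force-unlock if stale
    *"ConditionalCheckFailedException"*)
      echo "check-dynamodb-iam" ;;     # verify PutItem/DeleteItem/GetItem
    *"Error uploading state: Conflict"*|*"MD5 hash"*)
      echo "fix-serial" ;;             # pull, increment serial, push
    *"lock ID does not match"*)
      echo "refetch-lock-id" ;;        # read the real lock ID from the backend
    *)
      echo "init-reconfigure" ;;       # default branch
  esac
}
```

A typical call would be classify_tf_error "$(terraform plan 2>&1)", with the printed label used to pick the matching step in the guide below.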
Step-by-Step Guide
1. Identify the lock holder and error type
Read the full error message. Terraform prints the lock ID, who holds it, when it was acquired, and what operation they were running. [src1]
# The error output looks like this:
# Error: Error acquiring the state lock
#
# Lock Info:
# ID: a1b2c3d4-e5f6-7890-abcd-ef1234567890
# Path: terraform-state/production/terraform.tfstate
# Operation: OperationTypeApply
# Who: runner@ci-node-42
# Version: 1.9.5
# Created: 2026-02-23 10:15:32.123456 +0000 UTC
Verify: Check the Who field and Created timestamp — if the lock is hours/days old, it is stale.
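To make the staleness check less subjective, the lock age can be computed from the Created line. A minimal sketch, assuming GNU date (for -d parsing) and the Lock Info layout shown above; lock_age_seconds is a hypothetical helper, and the one-hour threshold in the usage comment is an arbitrary example, not a Terraform default.

```shell
#!/usr/bin/env bash
# lock_age_seconds [NOW_EPOCH]
# Reads Terraform error output on stdin and prints the lock age in seconds.
# Hypothetical helper; requires GNU date. Returns 1 if no Created line is found.
lock_age_seconds() {
  local now="${1:-$(date +%s)}" created created_epoch
  # Keep "YYYY-MM-DD HH:MM:SS +ZZZZ", dropping fractional seconds and the zone name
  created=$(sed -nE 's/^[#[:space:]]*Created:[[:space:]]+([0-9-]+ [0-9:]+)(\.[0-9]+)?[[:space:]]+([+-][0-9]{4}).*/\1 \3/p' | head -n 1)
  [ -n "$created" ] || return 1
  created_epoch=$(date -d "$created" +%s) || return 1
  echo $(( now - created_epoch ))
}

# Usage (threshold of 3600 seconds is an example value):
#   age=$(terraform plan 2>&1 | lock_age_seconds) && [ "$age" -gt 3600 ] \
#     && echo "lock is older than one hour; likely stale"
```

An old timestamp is only a hint. The check in the next step (no running process or pipeline) still applies before unlocking.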
2. Confirm no active operations
Before force-unlocking, ensure no other process is running terraform against this state file. [src6]
# Check local processes
ps aux | grep terraform
# Check CI/CD pipelines (GitHub Actions example)
gh run list --workflow=terraform.yml --status=in_progress
# Check Terraform Cloud (if applicable)
# --globoff stops curl from treating [status] as a URL glob range
curl -s --globoff -H "Authorization: Bearer $TFC_TOKEN" \
"https://app.terraform.io/api/v2/workspaces/$WORKSPACE_ID/runs?filter[status]=applying" \
| jq '.data[] | {id: .id, status: .attributes.status}'
Verify: No terraform processes listed, no CI runs in progress.
3. Force-unlock the state
Use the lock ID from the error message. [src2]
# Standard force-unlock (will prompt for confirmation)
terraform force-unlock a1b2c3d4-e5f6-7890-abcd-ef1234567890
# Skip confirmation prompt (CI/CD usage)
terraform force-unlock -force a1b2c3d4-e5f6-7890-abcd-ef1234567890
Verify: terraform plan runs without lock errors.
4. Handle serial mismatch conflicts
If you get an MD5 hash or serial conflict when pushing state, increment the serial. [src7]
# Pull current remote state
terraform state pull > current-state.tfstate
# Increment serial by 2 (must exceed remote serial)
jq '.serial = .serial + 2' current-state.tfstate > fixed-state.tfstate
# Push the fixed state
terraform state push fixed-state.tfstate
# Clean up
rm current-state.tfstate fixed-state.tfstate
Verify: terraform plan shows expected changes (or no changes).
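Before pushing, it can help to compare serial and lineage explicitly rather than relying on the push failing. A minimal sketch: safe_to_push is a hypothetical helper that applies the rule from the Constraints section (local serial must exceed remote serial) plus a lineage match. The usage lines assume jq is installed, and the file names are examples.

```shell
#!/usr/bin/env bash
# safe_to_push LOCAL_SERIAL REMOTE_SERIAL LOCAL_LINEAGE REMOTE_LINEAGE
# Hypothetical pre-flight check; returns 0 only when a push looks safe.
safe_to_push() {
  local local_serial="$1" remote_serial="$2"
  local local_lineage="$3" remote_lineage="$4"
  if [ "$local_lineage" != "$remote_lineage" ]; then
    echo "refuse: lineage differs (different state history)"
    return 1
  fi
  if [ "$local_serial" -le "$remote_serial" ]; then
    echo "refuse: local serial $local_serial is not above remote $remote_serial"
    return 1
  fi
  echo "ok: serial $local_serial > $remote_serial"
}

# Usage (assumes jq):
#   terraform state pull > remote.tfstate
#   safe_to_push \
#     "$(jq -r .serial fixed-state.tfstate)" "$(jq -r .serial remote.tfstate)" \
#     "$(jq -r .lineage fixed-state.tfstate)" "$(jq -r .lineage remote.tfstate)" \
#     && terraform state push fixed-state.tfstate
```

Keeping the comparison separate from the push means the check can run in CI as a dry gate without touching remote state.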
5. Delete stale DynamoDB lock (last resort)
If force-unlock fails, delete the lock item directly from DynamoDB. [src3]
# Scan for stale locks
aws dynamodb scan --table-name terraform-state-locks \
--filter-expression "attribute_exists(LockID)" \
--projection-expression "LockID,Info"
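# Delete the stale lock entry. The lock item's LockID is <bucket>/<key>;
# the separate "<bucket>/<key>-md5" item holds the state checksum, not the lock.
aws dynamodb delete-item --table-name terraform-state-locks \
--key '{"LockID": {"S": "terraform-state/production/terraform.tfstate"}}'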
Verify: terraform plan acquires lock successfully.
6. Recover from state corruption
If state is corrupted, restore from S3 versioning. [src5]
# List state file versions
aws s3api list-object-versions --bucket my-terraform-state \
--prefix "production/terraform.tfstate" \
--query 'Versions[0:5].{VersionId:VersionId,LastModified:LastModified,Size:Size}'
# Download last known good version
aws s3api get-object --bucket my-terraform-state \
--key "production/terraform.tfstate" \
--version-id "abc123def456" restored-state.tfstate
# Increment serial and push
jq '.serial = .serial + 2' restored-state.tfstate > push-state.tfstate
terraform state push push-state.tfstate
Verify: terraform plan shows expected infrastructure state.
Code Examples
HCL: S3 backend with native locking (recommended)
# Input: Backend config in terraform block
# Output: Remote state with S3-native locking (no DynamoDB needed)
terraform {
backend "s3" {
bucket = "mycompany-terraform-state"
key = "production/network/terraform.tfstate"
region = "us-east-1"
encrypt = true
use_lockfile = true # Native S3 locking (Terraform >= 1.10)
}
}
HCL: S3 backend with DynamoDB locking (legacy)
# Input: Backend config with DynamoDB lock table
# Output: Remote state with DynamoDB-based locking (deprecated)
terraform {
backend "s3" {
bucket = "mycompany-terraform-state"
key = "production/network/terraform.tfstate"
region = "us-east-1"
encrypt = true
dynamodb_table = "terraform-state-locks" # Deprecated in 1.10
}
}
Bash: CI/CD safe apply with lock timeout and retry
#!/usr/bin/env bash
# Input: Terraform workspace directory
# Output: Safe apply with lock timeout, retry, and auto-unlock on stale lock
set -euo pipefail
MAX_RETRIES=3
LOCK_TIMEOUT="5m"
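for attempt in $(seq 1 "$MAX_RETRIES"); do
  echo "Attempt $attempt of $MAX_RETRIES"
  if terraform apply -auto-approve -lock-timeout="$LOCK_TIMEOUT" 2>&1; then
    echo "Apply succeeded"; exit 0
  else
    # A quick plan (no lock timeout) re-prints the Lock Info block if the lock is still held
    LOCK_ID=$(terraform plan 2>&1 | grep -oP 'ID:\s+\K[a-f0-9-]+' || true)
    if [ -n "$LOCK_ID" ]; then
      # CAUTION: only safe when CI concurrency controls guarantee that no other
      # run can hold this lock; otherwise this is the blind force-unlock anti-pattern
      echo "Force-unlocking stale lock: $LOCK_ID"
      terraform force-unlock -force "$LOCK_ID"
    else
      echo "Non-lock error"; exit 1
    fi
  fi
done
echo "Failed after $MAX_RETRIES attempts"; exit 1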
YAML: GitHub Actions concurrency control
# Input: GitHub Actions workflow
# Output: Single-concurrent terraform apply per environment
name: Terraform Apply
on:
  push:
    branches: [main]
concurrency:
  group: terraform-${{ github.ref }}-production
  cancel-in-progress: false # Never cancel running apply
jobs:
  apply:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      - run: terraform apply -auto-approve -lock-timeout=10m
Anti-Patterns
Wrong: Force-unlocking without checking for active operations
# BAD — blindly force-unlocking can corrupt state if another process is running
terraform force-unlock -force $(terraform plan 2>&1 | grep -oP 'ID:\s+\K[a-f0-9-]+')
Correct: Verify no active operations, then unlock
# GOOD — check for running processes first
ps aux | grep "[t]erraform"
gh run list --workflow=terraform.yml --status=in_progress
# Only after confirming nothing is running:
terraform force-unlock a1b2c3d4-e5f6-7890-abcd-ef1234567890
Wrong: Disabling state locking entirely
# BAD — no locking creates race conditions and state corruption
terraform {
backend "s3" {
bucket = "my-state"
key = "terraform.tfstate"
region = "us-east-1"
}
}
# Running with -lock=false makes it worse
Correct: Enable locking and handle contention with timeouts
# GOOD — always enable locking, use timeouts for contention
terraform {
backend "s3" {
bucket = "my-state"
key = "terraform.tfstate"
region = "us-east-1"
encrypt = true
use_lockfile = true
}
}
# Run with: terraform apply -lock-timeout=10m
Wrong: Pushing state without checking serial
# BAD — pushing without serial check can overwrite newer state
terraform state pull > backup.tfstate
# ... make edits ...
terraform state push backup.tfstate
# Error: serial not greater than current
Correct: Pull latest, increment serial, then push
# GOOD — always increment serial before pushing
terraform state pull > current.tfstate
SERIAL=$(jq '.serial' current.tfstate)
jq ".serial = $SERIAL + 2" current.tfstate > updated.tfstate
terraform state push updated.tfstate
Wrong: Storing state files in Git
# BAD — state contains secrets; Git has no locking
git add terraform.tfstate
git commit -m "Update state"
# Merge conflicts on .tfstate are unrecoverable
Correct: Use remote backend with encryption and versioning
# GOOD — remote backend handles versioning and locking
terraform {
backend "s3" {
bucket = "my-state"
key = "terraform.tfstate"
region = "us-east-1"
encrypt = true
use_lockfile = true
}
}
# S3 versioning provides history; Git tracks only .tf files
Common Pitfalls
- Forgetting -lock-timeout in CI/CD: Default lock timeout is 0s, so terraform fails immediately if the lock is held. Fix: Always add -lock-timeout=5m to plan/apply. [src6]
- DynamoDB table with wrong key name: Partition key must be exactly LockID (capital L, capital ID) with type String. Fix: aws dynamodb describe-table --table-name terraform-state-locks. [src3]
- S3 bucket without versioning: Without versioning, corrupted state requires terraform import for every resource. Fix: aws s3api put-bucket-versioning --bucket my-state --versioning-configuration Status=Enabled. [src5]
- Multiple backends referencing same state key: Creates implicit coupling and lock contention. Fix: Use unique keys per stack, following a pattern such as key = "env/<env>/<component>/terraform.tfstate". Terraform backend blocks do not accept variables, so set the value literally or pass it with -backend-config. [src4]
- Running terraform from wrong directory: Each directory is a separate root module with its own state. Fix: Verify with terraform state list before modifying. [src1]
- Terraform Cloud auto-lock on VCS runs: VCS-triggered runs lock the workspace, so CLI runs fail. Fix: Unlock via UI or API. [src7]
- Ignoring lock info timestamps: A 5-minute-old lock may be active; a 5-hour-old lock is stale. Fix: Always check the Created field before force-unlocking. [src2]
Diagnostic Commands
# Check who holds the state lock
terraform plan 2>&1 | grep -A 10 "Lock Info"
# Inspect current state serial and lineage
terraform state pull | jq '{serial, lineage, terraform_version}'
# List all resources in current state
terraform state list
# Check DynamoDB lock table entries (S3 backend)
aws dynamodb scan --table-name terraform-state-locks \
--projection-expression "LockID,Info"
# Check S3 lock files (native S3 locking)
aws s3 ls s3://my-state-bucket/ --recursive | grep ".tflock"
# List S3 state file versions for recovery
aws s3api list-object-versions --bucket my-state-bucket \
--prefix "production/terraform.tfstate" \
--query 'Versions[0:5]'
# Verify state file is valid JSON
terraform state pull | python3 -c "import sys,json; json.load(sys.stdin); print('Valid')"
# Check Terraform Cloud workspace lock status
curl -s -H "Authorization: Bearer $TFC_TOKEN" \
"https://app.terraform.io/api/v2/workspaces/$WS_ID" \
| jq '.data.attributes.locked'
# Check for hung terraform processes
ps aux | grep "[t]erraform"
# Azure Blob lease status
az storage blob show --container-name tfstate --name terraform.tfstate \
--account-name mystorageaccount --query properties.lease
Version History & Compatibility
| Version | Status | Breaking Changes | Migration Notes |
|---|---|---|---|
| Terraform >= 1.10 | Current | Native S3 locking (use_lockfile), DynamoDB deprecated | Add use_lockfile = true, keep dynamodb_table during transition |
| Terraform 1.0-1.9 | Supported | None for locking | S3 backend requires dynamodb_table for locking |
| Terraform 0.14-0.15 | Legacy | Backend config changes | terraform init -reconfigure after upgrades |
| Terraform 0.12-0.13 | EOL | HCL2 migration | State format compatible, backends unchanged |
| OpenTofu >= 1.6 | Current (fork) | Same state format as Terraform | force-unlock, state commands work identically |
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| "Error acquiring the state lock" appears | State drift without lock errors | terraform plan -refresh-only |
| CI/CD pipeline stuck waiting for lock | Provider authentication failures | Provider-specific documentation |
| terraform force-unlock needed | Plan shows unexpected resource changes | terraform import or state inspection |
| State serial/MD5 mismatch errors | Terraform version upgrade issues | Terraform upgrade guide |
| Recovering from corrupted state file | Infrastructure wrong but state is correct | terraform apply to converge |
Important Caveats
- Native S3 locking (use_lockfile) stores a .tflock file alongside state. IAM policies must grant s3:PutObject and s3:DeleteObject on *.tflock paths.
- terraform force-unlock behavior varies by backend. Azure Blob Storage requires breaking the lease separately.
- State files contain sensitive data (passwords, tokens, private keys) in plaintext. Always encrypt at rest and restrict access via IAM.
- After any manual state manipulation (state push, state mv, state rm), always run terraform plan to verify before applying.
- Terraform Cloud serial conflicts often require API-level state manipulation. CLI state push may fail due to version mismatches.