`kubectl apply -f statefulset.yaml` with `volumeClaimTemplates` and a headless Service (`clusterIP: None`).

| Feature | StatefulSet | Deployment |
|---|---|---|
| Pod naming | Predictable: {name}-0, {name}-1, ... | Random: {name}-{hash} |
| Network identity | Stable DNS via headless Service | Ephemeral, load-balanced |
| Storage | Per-Pod PVC via volumeClaimTemplates | Shared or no persistent storage |
| Scaling order | Sequential (0, 1, 2...) | Parallel |
| Deletion order | Reverse sequential (2, 1, 0) | Parallel |
| Rolling update | Reverse ordinal (highest first) | Configurable (maxSurge) |
| Use for databases | Yes -- primary/replica, stable identity | Only stateless caches |
| Headless Service required | Yes | No |
| PVC cleanup on delete | Manual | N/A |
| Operator | Database | License | HA | Auto-Backup | Monitoring | CNCF |
|---|---|---|---|---|---|---|
| CloudNativePG | PostgreSQL | Apache 2.0 | Streaming replication + auto-failover | Object store (S3, GCS, Azure) + PITR | Prometheus exporter | Sandbox |
| Percona Operator PG | PostgreSQL | Apache 2.0 | Patroni-based HA | S3/GCS/Azure + PITR | PMM integration | No |
| Percona Operator MySQL | MySQL (PXC) | Apache 2.0 | Galera multi-primary | S3/GCS + PITR | PMM integration | No |
| Percona Operator MongoDB | MongoDB | Apache 2.0 | Replica set auto-failover | S3/GCS + PITR | PMM integration | No |
| MongoDB Community | MongoDB | Apache 2.0 | Replica set | Manual | Basic | No |
| Zalando PG Operator | PostgreSQL | MIT | Patroni HA | WAL-G to S3/GCS | Built-in | No |
| Crunchy PGO | PostgreSQL | Apache 2.0 | Patroni HA | pgBackRest | Prometheus | No |
| Component | DNS Format | Example |
|---|---|---|
| Pod | {pod}.{service}.{namespace}.svc.cluster.local | postgres-0.postgres-hl.default.svc.cluster.local |
| Service (headless) | {service}.{namespace}.svc.cluster.local | postgres-hl.default.svc.cluster.local |
| Primary (convention) | {statefulset}-0.{service}.{ns}.svc.cluster.local | postgres-0.postgres-hl.default.svc.cluster.local |
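The DNS names in the table compose mechanically from their components. A minimal sketch using the example names above (and assuming the default `cluster.local` cluster domain):

```shell
# Build the per-Pod FQDN from its components (example names from the table;
# a non-default cluster domain would replace cluster.local)
POD=postgres-0
SVC=postgres-hl
NS=default
FQDN="${POD}.${SVC}.${NS}.svc.cluster.local"
echo "$FQDN"   # postgres-0.postgres-hl.default.svc.cluster.local

# A client inside the cluster could then target the conventional primary, e.g.:
# psql "host=${FQDN} port=5432 user=postgres dbname=appdb"
```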
```
START: Do you need a database on Kubernetes?
├── Can you use a managed database (RDS, Cloud SQL, Azure DB)?
│   ├── YES → Use managed database. Simplest, most reliable option.
│   └── NO (on-prem, air-gapped, cost, or data sovereignty) ↓
├── Team has 3+ engineers with Kubernetes experience?
│   ├── YES → Use a database operator (CloudNativePG, Percona, Crunchy PGO)
│   └── NO ↓
├── Database is PostgreSQL?
│   ├── YES → CloudNativePG (simplest operator, CNCF, strong community)
│   └── NO ↓
├── Database is MySQL?
│   ├── YES → Percona Operator for MySQL (Galera-based HA)
│   └── NO ↓
├── Database is MongoDB?
│   ├── YES → Percona Operator for MongoDB (replica set + sharding)
│   └── NO ↓
├── Need fine-grained control or learning exercise?
│   ├── YES → Manual StatefulSet (see Step-by-Step Guide)
│   └── NO ↓
└── DEFAULT → CloudNativePG for PostgreSQL, Percona for MySQL/MongoDB.
```
The headless Service provides stable DNS names for each Pod. Without it, StatefulSet Pods cannot be individually addressed. [src1]
```yaml
apiVersion: v1
kind: Service
metadata:
  name: postgres-hl
spec:
  clusterIP: None  # Headless
  selector:
    app: postgres
  ports:
    - port: 5432
      name: postgres
```

Verify: `kubectl get svc postgres-hl` → should show `CLUSTER-IP: None`
Never store passwords in plain text in StatefulSet YAML. Use Kubernetes Secrets. [src5]
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret
type: Opaque
stringData:
  POSTGRES_PASSWORD: "changeme-use-strong-password"
  POSTGRES_USER: "postgres"
  POSTGRES_DB: "appdb"
```

Verify: `kubectl get secret postgres-secret` → Opaque type with 3 data keys
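The same Secret can be created imperatively, which keeps the password out of files checked into version control (a sketch; the literal values are placeholders):

```shell
kubectl create secret generic postgres-secret \
  --from-literal=POSTGRES_USER=postgres \
  --from-literal=POSTGRES_DB=appdb \
  --from-literal=POSTGRES_PASSWORD='changeme-use-strong-password'

# Secret data is base64-encoded, not encrypted -- anyone with read access can decode it:
kubectl get secret postgres-secret -o jsonpath='{.data.POSTGRES_USER}' | base64 -d
```

For production clusters, consider encryption at rest or an external secrets manager on top of this.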
Creates a PostgreSQL instance with persistent storage. Each replica gets its own PVC. [src1] [src5]
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-hl
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16-alpine
          # ... (see full YAML in Code Examples)
  volumeClaimTemplates:
    - metadata:
        name: postgres-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard
        resources:
          requests:
            storage: 10Gi
```

Verify: `kubectl get statefulset postgres` → READY: 1/1; `kubectl get pvc` → postgres-data-postgres-0 Bound
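Scaling this StatefulSet demonstrates the ordering guarantees from the comparison table (a sketch against the resources above):

```shell
# Scale up: Pods are created sequentially -- postgres-1 only starts
# after postgres-0 is Running and Ready
kubectl scale statefulset postgres --replicas=3
kubectl get pods -l app=postgres -w

# Scale down removes the highest ordinal first (postgres-2, then postgres-1);
# their PVCs remain behind and must be deleted manually
kubectl scale statefulset postgres --replicas=1
```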
For primary/replica setups, use init containers to determine the Pod's role based on its ordinal index. [src2]
```yaml
initContainers:
  - name: init-role
    image: postgres:16-alpine
    # assumes a shared volume (e.g. an emptyDir) is mounted at /config in
    # both this init container and the main container
    command: ['sh', '-c']
    args:
      - |
        ORDINAL=$(echo $HOSTNAME | rev | cut -d'-' -f1 | rev)
        if [ "$ORDINAL" = "0" ]; then
          echo "primary" > /config/role
        else
          echo "replica" > /config/role
        fi
```

Verify: `kubectl exec postgres-0 -- cat /config/role` → primary
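The ordinal extraction can be checked outside the cluster; this sketch simulates the `HOSTNAME` that the kubelet sets inside the Pod:

```shell
# Simulate the Pod hostname (in-cluster it is set automatically, e.g. postgres-2)
HOSTNAME="postgres-2"
# rev/cut/rev takes the LAST '-'-separated field, so StatefulSet names that
# themselves contain dashes (e.g. my-db-2) still yield the correct ordinal
ORDINAL=$(echo "$HOSTNAME" | rev | cut -d'-' -f1 | rev)
echo "$ORDINAL"   # 2
if [ "$ORDINAL" = "0" ]; then echo primary; else echo replica; fi   # replica
```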
Use the database's native backup tool in a CronJob, not filesystem snapshots. [src7]
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure  # required for Job Pods
          containers:
            - name: backup
              image: postgres:16-alpine
              # assumes PGPASSWORD is supplied from postgres-secret and a
              # backup volume is mounted at /backup
              command: ['sh', '-c']
              args:
                - pg_dump -h postgres-0.postgres-hl -U postgres -Fc appdb > /backup/$(date +%Y%m%d).dump
```

Verify: `kubectl get cronjob postgres-backup` → SCHEDULE `0 2 * * *`
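The redirection target in the CronJob expands to a date-stamped filename; this sketch shows the expansion locally (the `/backup` path assumes the mounted backup volume noted above):

```shell
# The CronJob's output path expands to a date-stamped file, e.g. /backup/20250101.dump
FILE="/backup/$(date +%Y%m%d).dump"
echo "$FILE"

# To test the job without waiting for 2 AM, trigger a one-off run:
# kubectl create job --from=cronjob/postgres-backup postgres-backup-manual
```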
For production PostgreSQL, install CloudNativePG and declare a Cluster resource. [src3]
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: app-db
spec:
  instances: 3
  storage:
    size: 20Gi
    storageClass: standard
  backup:
    barmanObjectStore:
      destinationPath: s3://my-bucket/backups
      # object-store credentials (e.g. s3Credentials referencing a Secret)
      # must also be configured for backups to run
```

Verify: `kubectl get cluster app-db` → Phase: Cluster in healthy state
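CloudNativePG also creates Services for the cluster automatically, so applications never address Pods directly (a sketch using the `app-db` cluster above):

```shell
# app-db-rw -- always routes to the current primary (read-write)
# app-db-ro -- load-balances across replicas (read-only)
kubectl get svc app-db-rw app-db-ro

# Because app-db-rw follows the primary, a failover requires no
# connection-string changes in the application
```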
Full script: postgres-statefulset.yaml (48 lines)
```yaml
# Input: Kubernetes cluster with dynamic storage provisioner
# Output: Single PostgreSQL instance with persistent storage
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-hl
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16-alpine
          ports:
            - containerPort: 5432
          envFrom:
            - secretRef:
                name: postgres-secret
          env:
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 2Gi
          readinessProbe:
            exec:
              command: ["pg_isready", "-U", "postgres"]
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            exec:
              command: ["pg_isready", "-U", "postgres"]
            initialDelaySeconds: 30
            periodSeconds: 15
  volumeClaimTemplates:
    - metadata:
        name: postgres-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard
        resources:
          requests:
            storage: 10Gi
```
```yaml
# Input: Kubernetes cluster with CloudNativePG operator installed
# Output: 3-node PostgreSQL HA cluster with S3 backups + PITR
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: production-db
spec:
  instances: 3
  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "512MB"
  storage:
    size: 50Gi
    storageClass: fast-ssd
  backup:
    retentionPolicy: "30d"
    barmanObjectStore:
      destinationPath: s3://backups/production-db
```
```yaml
# BAD -- Deployment gives random Pod names, no stable DNS for replication
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: postgres
          image: postgres:16-alpine
      volumes:
        - name: data
          emptyDir: {}  # Data lost when the Pod is deleted or rescheduled!
```

```yaml
# GOOD -- StatefulSet provides stable identity and persistent storage
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-hl
  replicas: 3
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```
```yaml
# BAD -- ClusterIP Service load-balances; replicas can't address each other
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  selector:
    app: postgres
  ports:
    - port: 5432
  # Missing clusterIP: None -- creates a load-balanced Service
```

```yaml
# GOOD -- headless Service creates individual DNS records per Pod
apiVersion: v1
kind: Service
metadata:
  name: postgres-hl
spec:
  clusterIP: None
  selector:
    app: postgres
  ports:
    - port: 5432
```
```shell
# BAD -- copying the data directory while the database is running produces a corrupt backup
kubectl cp postgres-0:/var/lib/postgresql/data ./backup/
# WAL files may be inconsistent; you'll get a corrupted restore

# GOOD -- pg_dump creates a consistent logical backup
kubectl exec postgres-0 -- pg_dump -U postgres -Fc appdb > backup.dump

# GOOD -- for physical backup, use pg_basebackup
kubectl exec postgres-0 -- pg_basebackup -D /tmp/backup -Ft -z -P
```
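A backup is only as good as its restore path. A sketch of restoring the custom-format dump produced above into a fresh database (names match the earlier examples; `appdb_restored` is a placeholder):

```shell
# Create an empty target database, then restore the dump into it
kubectl exec postgres-0 -- createdb -U postgres appdb_restored
kubectl exec -i postgres-0 -- pg_restore -U postgres -d appdb_restored < backup.dump
```

Test this periodically; an unverified backup should be treated as no backup.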
```yaml
# BAD -- credentials visible in version control and kubectl describe
env:
  - name: POSTGRES_PASSWORD
    value: "mysecretpassword"
```

```yaml
# GOOD -- reference a Secret for credentials
envFrom:
  - secretRef:
      name: postgres-secret
```
- Scaling a StatefulSet down does not delete its PVCs; after scaling from 3 replicas to 1, remove the orphaned claims manually: `kubectl delete pvc postgres-data-postgres-1 postgres-data-postgres-2`. [src1]
- Set `PGDATA=/var/lib/postgresql/data/pgdata` and mount the volume at `/var/lib/postgresql/data`; PostgreSQL refuses to initialize directly in a mount point (it contains `lost+found`). [src5]
- Use `pg_isready` or `mysqladmin ping` as the readiness probe so traffic only reaches Pods whose database actually accepts connections. [src2]
- Requesting `ReadWriteMany` when the storage provider only supports `ReadWriteOnce` causes PVC binding failures. Fix: check `kubectl get storageclass` and match access modes. [src5]
- Add a PodDisruptionBudget with `minAvailable: 1` for database StatefulSets so voluntary disruptions (e.g. node drains) never evict all replicas at once. [src1]

```shell
# List StatefulSet status and ready replicas
kubectl get statefulset postgres -o wide
# Check PVC status (should all be Bound)
kubectl get pvc -l app=postgres
# View Pod DNS resolution from inside the cluster
kubectl run -it --rm debug --image=busybox -- nslookup postgres-0.postgres-hl
# Check database readiness from Pod
kubectl exec postgres-0 -- pg_isready -U postgres
# View StatefulSet events (useful for debugging stuck rollouts)
kubectl describe statefulset postgres
# Check storage class availability
kubectl get storageclass
# View Pod logs for database startup errors
kubectl logs postgres-0 --tail=50
# Check PV reclaim policy (should be Retain for production)
kubectl get pv -o custom-columns=NAME:.metadata.name,RECLAIM:.spec.persistentVolumeReclaimPolicy,STATUS:.status.phase
# Check operator status (CloudNativePG)
kubectl get cluster -A
kubectl get pods -n cnpg-system
```
| K8s Version | Feature | Status | Notes |
|---|---|---|---|
| 1.9+ | StatefulSet API | GA (apps/v1) | Stable since 2017 |
| 1.27+ | PVC auto-deletion | Beta | StatefulSetAutoDeletePVC feature gate (alpha since 1.23) |
| 1.25+ | minReadySeconds | GA | Pod must be ready for N seconds |
| 1.27+ | Start ordinal | Beta | Custom starting ordinal index (alpha in 1.26) |
| 1.27+ | PVC resize for StatefulSets | Stable | Expand PVCs without recreation |
| 1.31+ | Start ordinal | GA | Custom ordinal ranges fully stable |
| 1.31+ | maxUnavailable for RollingUpdate | Beta | Parallel rolling updates |
| Operator | Latest Version | Min K8s | Database Support |
|---|---|---|---|
| CloudNativePG | 1.25.x | 1.27+ | PostgreSQL 12-17 |
| Percona Operator PG | 2.5.x | 1.27+ | PostgreSQL 13-17 |
| Percona Operator MySQL | 1.16.x | 1.26+ | MySQL 8.0 (PXC) |
| Percona Operator MongoDB | 1.18.x | 1.26+ | MongoDB 6.0-8.0 |
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Database requires stable Pod identity for primary/replica topology | Application is stateless (web servers, API gateways) | Deployment |
| Need ordered startup (primary before replicas) | Database can tolerate random Pod names | Deployment with PVC |
| Per-Pod persistent storage is required | All replicas share the same data volume | Deployment + single PVC |
| Running on-prem or air-gapped (no managed DB option) | Cloud provider offers managed database | Managed database service |
| Dev/test environment needing realistic database setup | Production DB needing automated failover and PITR | Database operator (CloudNativePG, Percona) |
| Learning Kubernetes stateful workloads | Team lacks Kubernetes operational expertise | Managed database service |
- PVC auto-deletion (`persistentVolumeClaimRetentionPolicy`) is Beta as of Kubernetes 1.31 -- do not rely on it in production without testing; the default behavior is to retain PVCs
- PVC expansion requires a StorageClass with `allowVolumeExpansion: true` -- shrinking is never supported