The first rule of running databases in Kubernetes: don't use a Deployment. Deployments are designed for stateless workloads — pods are interchangeable, can be replaced in any order, get random names, and share no storage identity. Databases need the opposite: a stable hostname for each replica, ordered startup so the primary is available before replicas try to connect, and dedicated persistent storage that follows the pod if it's rescheduled.

StatefulSets provide all of this. This tutorial deploys a 3-replica PostgreSQL cluster to demonstrate each guarantee in practice.

What You'll Build

A 3-replica PostgreSQL StatefulSet named postgres with pods postgres-0, postgres-1, and postgres-2. Each pod gets its own PersistentVolumeClaim that persists independently. A headless Service enables stable DNS (postgres-0.postgres.default.svc.cluster.local) so you can always reach a specific replica.

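Note: this tutorial focuses on StatefulSet mechanics. The manifest in Step 4 does not configure streaming replication, so each pod runs an independent PostgreSQL instance. For production PostgreSQL with replication and automatic failover, use an operator such as CloudNativePG or the Zalando Postgres Operator instead of a raw StatefulSet.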

Step 1: What Makes StatefulSets Different

Feature	Deployment	StatefulSet
Pod names	Random (pod-abc123)	Ordered (pod-0, pod-1, pod-2)
Startup order	All pods start in parallel	Sequential: pod-0 must be Running and Ready before pod-1 starts
Shutdown order	Any order	Reverse sequential: pod-2 before pod-1 before pod-0
Storage per pod	Shared PVC (or none)	Dedicated PVC per pod via volumeClaimTemplates
DNS	Reached through a Service's single ClusterIP	Stable per pod: pod-0.<service>.<namespace>.svc.cluster.local
Pod identity	Ephemeral — any pod can replace any other	Sticky — pod-0 is always pod-0

The sticky identity is what makes databases workable. Because postgres-0 always refers to the same logical pod, you can designate it as the primary and point clients at its DNS name. If the pod is rescheduled to a different node, it still comes up as postgres-0 with the same PVC and the same DNS name.

Step 2: Create the Headless Service

A headless Service (one with clusterIP: None) enables per-pod DNS without load balancing. Instead of routing to a single VIP, DNS resolves to individual pod IPs.

bash

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  clusterIP: None
  selector:
    app: postgres
  ports:
    - name: postgres
      port: 5432
      targetPort: 5432
EOF

With this Service, each pod gets a stable DNS entry:

postgres-0.postgres.default.svc.cluster.local
postgres-1.postgres.default.svc.cluster.local
postgres-2.postgres.default.svc.cluster.local
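
The naming pattern behind these entries can be sketched as a small shell helper. This is only an illustration: pod_fqdn is a hypothetical function name, and the default namespace and the cluster.local domain are assumptions that match this tutorial but may differ in your cluster.

```shell
# Build the per-pod DNS name that a headless Service gives a StatefulSet pod.
# Arguments: <statefulset> <ordinal> <service> [namespace] [cluster-domain]
pod_fqdn() {
  sts="$1"; ordinal="$2"; svc="$3"
  ns="${4:-default}"             # assumption: pods live in the default namespace
  domain="${5:-cluster.local}"   # assumption: default cluster domain
  printf '%s-%s.%s.%s.svc.%s\n' "$sts" "$ordinal" "$svc" "$ns" "$domain"
}

# Print the names for the three replicas used in this tutorial.
i=0
while [ "$i" -lt 3 ]; do
  pod_fqdn postgres "$i" postgres
  i=$((i + 1))
done
```

Passing a fourth and fifth argument covers other namespaces or a custom cluster domain.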

If you also want a load-balanced endpoint (for read connections), create a second, non-headless Service alongside this one.
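
A minimal sketch of such a second Service follows. The name postgres-read and the file name are hypothetical, chosen only for illustration. The block writes the manifest to a file so you can review it before applying it.

```shell
# Write a hypothetical ClusterIP Service manifest for load-balanced connections.
# "postgres-read" is an illustrative name, not something this tutorial defines.
cat > postgres-read-svc.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: postgres-read
  labels:
    app: postgres
spec:
  type: ClusterIP      # regular Service: one virtual IP in front of all ready pods
  selector:
    app: postgres      # same label the StatefulSet pods carry
  ports:
    - name: postgres
      port: 5432
      targetPort: 5432
EOF
```

Apply it with kubectl apply -f postgres-read-svc.yaml. Connections to postgres-read are then spread across whichever pods are Ready, while the headless postgres Service keeps providing the per-pod names.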

Step 3: Create the Secret

Never put database passwords in plain YAML. Create a Secret first:

bash

kubectl create secret generic postgres-secret \
  --from-literal=password='StrongPassword123!'

Confirm that the Secret holds the expected value:

bash

kubectl get secret postgres-secret -o jsonpath='{.data.password}' | base64 -d; echo
# StrongPassword123!

Step 4: Deploy the StatefulSet

bash

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 3
  podManagementPolicy: OrderedReady
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: postgres
          image: postgres:16
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
          readinessProbe:
            exec:
              command: ["pg_isready", "-U", "postgres"]
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            exec:
              command: ["pg_isready", "-U", "postgres"]
            initialDelaySeconds: 30
            periodSeconds: 10
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard
        resources:
          requests:
            storage: 10Gi
EOF

Key fields to understand:

serviceName: postgres — must match the headless Service name. This is what enables the stable DNS per pod.

podManagementPolicy: OrderedReady — the default. Kubernetes waits for postgres-0 to pass its readiness probe before starting postgres-1. Change to Parallel only if your workload doesn't need ordered startup.

terminationGracePeriodSeconds: 60 — gives PostgreSQL time to finish checkpoints and close connections cleanly before SIGKILL. The default (30 seconds) is often too short for a loaded database.

PGDATA: /var/lib/postgresql/data/pgdata - PostgreSQL initializes its data directory on first start, and initdb refuses to run in a non-empty directory. The PostgreSQL Docker image documentation recommends pointing PGDATA at a subdirectory when the volume is a filesystem mount point. If you mount to /var/lib/postgresql/data and set PGDATA to the same path, init fails if the filesystem has a lost+found directory at its root. The /pgdata subdirectory avoids this.

volumeClaimTemplates — this is the StatefulSet-specific feature. Kubernetes creates a PVC for each pod using this template:

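data-postgres-0 (10Gi, mounted by postgres-0)
data-postgres-1 (10Gi, mounted by postgres-1)
data-postgres-2 (10Gi, mounted by postgres-2)

By default, these PVCs are not deleted when the StatefulSet is deleted or scaled down. This is an intentional data-safety measure. Newer Kubernetes versions let you change it with the persistentVolumeClaimRetentionPolicy field.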
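
The claim names follow the pattern template name, StatefulSet name, ordinal. A small sketch of that convention, using hypothetical helper names pvc_name and pvc_names, can help when scripting around these claims:

```shell
# PVC naming convention: <volumeClaimTemplate name>-<StatefulSet name>-<ordinal>
pvc_name() {
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

# List the claims a StatefulSet with the given replica count would own.
# Arguments: <template> <statefulset> <replicas>
pvc_names() {
  template="$1"; sts="$2"; replicas="$3"
  n=0
  while [ "$n" -lt "$replicas" ]; do
    pvc_name "$template" "$sts" "$n"
    n=$((n + 1))
  done
}

# The three claims created by this tutorial's manifest.
pvc_names data postgres 3
```

Because the names are deterministic, a pod that is recreated with the same ordinal finds the same claim again.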

Step 5: Verify Ordered Startup

bash

kubectl get pods -w
# NAME         READY   STATUS              RESTARTS   AGE
# postgres-0   0/1     ContainerCreating   0          3s
# postgres-0   0/1     Running             0          8s
# postgres-0   1/1     Running             0          18s   ← readiness probe passes
# postgres-1   0/1     Pending             0          0s
# postgres-1   0/1     ContainerCreating   0          2s
# postgres-1   1/1     Running             0          16s   ← postgres-1 now ready
# postgres-2   0/1     Pending             0          0s
# postgres-2   1/1     Running             0          16s

postgres-1 doesn't even start until postgres-0 is fully ready. postgres-2 waits for postgres-1. This ordering guarantee is what lets you safely configure postgres-0 as the primary and have replicas join after the primary is up.
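
If a script needs to block until the ordered rollout has finished, one option is to poll the StatefulSet's ready count. The sketch below is a hedged example rather than the only way to do this: wait_sts_ready is a hypothetical helper, and the attempt count and pause are arbitrary defaults.

```shell
# Poll a StatefulSet until status.readyReplicas reaches the desired count.
# Arguments: <statefulset> <replicas> [max attempts] [seconds between attempts]
wait_sts_ready() {
  sts="$1"; want="$2"; attempts="${3:-60}"; pause="${4:-5}"
  try=0
  while [ "$try" -lt "$attempts" ]; do
    ready="$(kubectl get statefulset "$sts" -o jsonpath='{.status.readyReplicas}')"
    # The field can be absent while no pod is ready, so treat empty as 0.
    ready="${ready:-0}"
    echo "ready: ${ready}/${want}"
    [ "$ready" -ge "$want" ] && return 0
    try=$((try + 1))
    sleep "$pause"
  done
  return 1   # gave up before all replicas became ready
}
```

Call it as wait_sts_ready postgres 3. With OrderedReady, the printed count should climb one pod at a time.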

Step 6: Verify Stable DNS

Run a debug pod and test DNS resolution:

bash

kubectl run -it --rm debug \
  --image=postgres:16 \
  --restart=Never \
  -- psql -h postgres-0.postgres.default.svc.cluster.local -U postgres
# Password for user postgres: StrongPassword123!
# psql (16.x)
# Type "help" for help.
# postgres=#

The hostname postgres-0.postgres.default.svc.cluster.local always resolves to the pod named postgres-0, regardless of which node it's running on or its current IP.

Verify all three DNS entries resolve:

bash

kubectl run -it --rm debug --image=busybox --restart=Never -- sh

# Inside the debug pod:
nslookup postgres-0.postgres.default.svc.cluster.local
# Address: 10.0.2.15
nslookup postgres-1.postgres.default.svc.cluster.local
# Address: 10.0.1.22
nslookup postgres-2.postgres.default.svc.cluster.local
# Address: 10.0.3.8

Step 7: Scaling

Scale up to 5 replicas:

bash

kubectl scale statefulset postgres --replicas=5
# postgres-3 starts after postgres-2 is ready
# postgres-4 starts after postgres-3 is ready

Scale back down to 2:

bash

kubectl scale statefulset postgres --replicas=2
# postgres-4 is terminated first, then postgres-3, then postgres-2
# Reverse-ordered shutdown ensures replicas are removed before the primary

Scaling down does not delete PVCs. After scaling to 2, data-postgres-2, data-postgres-3, and data-postgres-4 still exist:

bash

kubectl get pvc
# NAME             STATUS   VOLUME       CAPACITY
# data-postgres-0  Bound    pvc-abc...   10Gi
# data-postgres-1  Bound    pvc-def...   10Gi
# data-postgres-2  Bound    pvc-ghi...   10Gi   ← still exists (pod terminated, PVC retained by design)
# data-postgres-3  Bound    pvc-jkl...   10Gi   ← still exists
# data-postgres-4  Bound    pvc-mno...   10Gi   ← still exists

This is intentional. If you later scale back up to 3, postgres-2 will be reattached to data-postgres-2 with all its data intact. Delete the orphaned PVCs manually if you want to reclaim the storage.
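
To see which claims a scale-down leaves behind, you can derive their names from the old and new replica counts. This is a sketch under the naming convention shown earlier. The helper name orphaned_pvcs is hypothetical, and the function only prints names, it does not delete anything.

```shell
# Print the PVC names left without a pod after scaling a StatefulSet down.
# Arguments: <template> <statefulset> <current replicas> <previous replicas>
orphaned_pvcs() {
  template="$1"; sts="$2"; current="$3"; previous="$4"
  n="$current"
  while [ "$n" -lt "$previous" ]; do
    printf '%s-%s-%s\n' "$template" "$sts" "$n"
    n=$((n + 1))
  done
}

# After scaling from 5 to 2, ordinals 2, 3 and 4 no longer have pods.
orphaned_pvcs data postgres 2 5
```

Once you have reviewed the list, you could pass it to kubectl delete pvc, for example with xargs. Deleting a claim discards that replica's data, so check the list first.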

Step 8: Inspect the Running Cluster

Connect to each pod directly:

bash

# Connect to the primary
kubectl exec -it postgres-0 -- psql -U postgres

# List databases
postgres=# \l

# Check version
postgres=# SELECT version();

Verify each pod's dedicated storage:

bash

kubectl describe pod postgres-0 | grep -A3 "Volumes:"
# Volumes:
#   data:
#     Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
#     ClaimName:  data-postgres-0

Each pod has its own claim. If postgres-0 is rescheduled to a different node, it mounts data-postgres-0 on the new node — the data follows the pod.
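
To check the pod-to-claim mapping for every replica at once, a loop over the ordinals can read the claim name from each pod spec. This is a sketch: show_claims is a hypothetical helper, and the volume name data is taken from this tutorial's manifest.

```shell
# Print "<pod> -> <claim>" for each ordinal, reading the claim from the pod spec.
# Arguments: <statefulset> <replicas> [volume name]
show_claims() {
  sts="$1"; replicas="$2"; volume="${3:-data}"
  n=0
  while [ "$n" -lt "$replicas" ]; do
    claim="$(kubectl get pod "${sts}-${n}" \
      -o jsonpath="{.spec.volumes[?(@.name==\"${volume}\")].persistentVolumeClaim.claimName}")"
    echo "${sts}-${n} -> ${claim}"
    n=$((n + 1))
  done
}
```

Running show_claims postgres 3 against the cluster from this tutorial should pair each pod with the claim that shares its ordinal.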

Common Mistakes to Avoid

Using a regular (ClusterIP) Service as serviceName — serviceName must reference a headless Service. If you point it at a regular Service, the stable per-pod DNS won't work. The pods will still run, but you lose the DNS guarantee.

PGDATA at the root of the mount — set PGDATA to a subdirectory like /var/lib/postgresql/data/pgdata. Some provisioners create a lost+found directory at the volume root. PostgreSQL refuses to initialize if the data directory is non-empty. The subdirectory avoids this collision entirely.

Deleting the StatefulSet and expecting PVCs to be cleaned up — kubectl delete statefulset postgres does NOT delete the PVCs. This is the correct behavior (your data is preserved). But if you want to fully tear down, you must delete the PVCs separately.

kubectl delete statefulset does not guarantee ordered pod termination — the reverse-sequential shutdown order (pod-N first, pod-0 last) only applies during scale-down operations. When you delete the StatefulSet object directly, Kubernetes does not guarantee termination order. To get ordered shutdown before deletion, scale to 0 first: kubectl scale statefulset postgres --replicas=0, wait for all pods to terminate, then delete the StatefulSet.

podManagementPolicy: Parallel for databases - this policy starts all pods simultaneously, which is useful when pods don't depend on each other's startup and you need fast scale-out. In a replicated PostgreSQL setup, the replica containers may try to connect to the primary before it's ready and crash. Use OrderedReady unless you've verified your initialization process can handle a parallel start.

No terminationGracePeriodSeconds - the default is 30 seconds. Under heavy load, PostgreSQL may need longer to finish its shutdown checkpoint and close connections. A SIGKILL mid-shutdown should not corrupt data, because PostgreSQL replays its WAL on the next start, but it forces crash recovery and a slower restart. Set it to 60 to 120 seconds for production.
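
The ordered-teardown advice in the list above (scale to 0, wait, then delete) can be sketched as a single function. The name ordered_teardown and the 180s default timeout are assumptions for illustration, and older kubectl versions may report an error from the wait step if the pods are already gone.

```shell
# Scale to zero so pods stop in reverse ordinal order, wait for them to
# disappear, then delete the StatefulSet object itself.
# Arguments: <statefulset> <pod label selector> [timeout]
ordered_teardown() {
  sts="$1"; selector="$2"; timeout="${3:-180s}"
  kubectl scale statefulset "$sts" --replicas=0 || return 1
  kubectl wait --for=delete pod -l "$selector" --timeout="$timeout" || return 1
  kubectl delete statefulset "$sts"
}
```

For this tutorial the call would be ordered_teardown postgres app=postgres. The PVCs remain afterwards and still need to be deleted separately.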

Cleanup

bash

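kubectl delete statefulset postgres
kubectl delete svc postgres
kubectl delete secret postgres-secret
# PVCs are NOT deleted by the above; delete them by name:
kubectl delete pvc data-postgres-0 data-postgres-1 data-postgres-2
# If you scaled to 5 replicas in Step 7, these two also exist:
kubectl delete pvc data-postgres-3 data-postgres-4 --ignore-not-found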

What's Next

CloudNativePG — Kubernetes-native PostgreSQL operator with automatic failover, backup, and PITR
Kubernetes Storage: PVCs and StorageClasses — if you need a primer on PVCs before this tutorial

Official References

StatefulSets — complete reference for StatefulSet guarantees, pod identity, and deployment/scaling semantics
Headless Services — how headless Services enable per-pod DNS
VolumeClaimTemplates — how Kubernetes creates and manages per-pod PVCs
Running Databases in Kubernetes — when it makes sense to run databases in Kubernetes vs. using managed services
