Kubernetes

Zero-Downtime Deployments with Rolling Updates and Readiness Probes

Beginner · 25 min to complete · 8 min read

Most Kubernetes deployments drop traffic during updates because readiness probes are misconfigured or missing. This tutorial shows you the exact configuration that eliminates downtime — and how to verify it.

Before you begin

  • kubectl configured against a running cluster
  • Basic understanding of Kubernetes Pods and Deployments

The default Kubernetes rolling update strategy sounds safe: bring up new pods before terminating old ones. But if your readiness probe isn't configured correctly, Kubernetes will send traffic to pods that aren't ready yet — and your users see errors.

This tutorial covers the complete configuration that makes rolling updates actually zero-downtime.

Why Traffic Drops During Updates

When Kubernetes updates a deployment, it follows this sequence:

  1. Create a new pod with the updated image
  2. Wait for the pod to pass its readiness probe
  3. Add the pod to the Service endpoints
  4. Terminate an old pod
  5. Repeat until all pods are updated

The problem: if there's no readiness probe, step 2 considers the pod ready the moment the container starts. Your app might need 5–10 seconds to warm up its database connections, load config, or compile templates. During that window, the pod receives traffic it can't handle.

The second problem: when a pod receives SIGTERM (step 4), it might still be handling active requests. If the app exits immediately, those requests fail.
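The second problem can be handled in the application itself by trapping SIGTERM. A minimal sketch of a shell entrypoint that drains before exiting — the self-kill line below stands in for Kubernetes delivering the signal, and the request loop is a placeholder:

```shell
#!/bin/sh
# Sketch: graceful SIGTERM handling for a container entrypoint.
# Kubernetes sends SIGTERM (after any preStop hook runs), then SIGKILL
# once terminationGracePeriodSeconds expires. Trapping TERM lets the
# process finish in-flight work instead of dying mid-request.

shutdown=0
trap 'shutdown=1' TERM

# Simulate Kubernetes delivering SIGTERM ~0.2s from now.
( sleep 0.2; kill -TERM $$ ) &

served=0
while [ "$shutdown" -eq 0 ]; do
  served=$((served + 1))   # stand-in for serving one request
  sleep 0.05
done

echo "drained cleanly after $served requests"
```

The same pattern applies in any language: register a SIGTERM handler, stop accepting new work, finish what's in flight, then exit 0.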

Step 1: Deploy a Sample Application Without Probes

bash
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: nginx:1.24
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: api-server
spec:
  selector:
    app: api-server
  ports:
    - port: 80
      targetPort: 80
EOF

Step 2: Add Readiness and Liveness Probes

bash
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # Allow 1 extra pod during update
      maxUnavailable: 0   # Never reduce below desired count
  template:
    metadata:
      labels:
        app: api-server
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: api
          image: nginx:1.24
          ports:
            - containerPort: 80
          readinessProbe:
            httpGet:
              path: /healthz
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 3
            successThreshold: 1
          livenessProbe:
            httpGet:
              path: /healthz
              port: 80
            initialDelaySeconds: 15
            periodSeconds: 10
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "15"]
EOF

Note: nginx doesn't serve /healthz out of the box, so this rollout will stall at the first new pod until you add the health endpoint in Step 3 — apply both and the rollout completes.

Let's break down each piece:

maxUnavailable: 0 — Kubernetes must never have fewer than the desired replica count available. This forces it to bring up the new pod first (maxSurge: 1) before terminating any old pod.

readinessProbe — Until this passes, the pod does not receive traffic. initialDelaySeconds: 5 gives the app 5 seconds before the first check. failureThreshold: 3 means three consecutive failures before marking the pod unready.

livenessProbe — If this fails, Kubernetes restarts the container. Set initialDelaySeconds higher than your readiness probe — you don't want the liveness probe killing a pod that's still starting up.
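If startup time varies widely, a startupProbe (stable since Kubernetes v1.20) is a cleaner fix than inflating initialDelaySeconds: liveness and readiness checks are held off until it succeeds once. A sketch for the same container, reusing the /healthz path:

```yaml
startupProbe:
  httpGet:
    path: /healthz
    port: 80
  periodSeconds: 5
  failureThreshold: 12   # tolerates up to 60s of startup before restarting
```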

terminationGracePeriodSeconds: 60 — Kubernetes waits up to 60 seconds for the pod to exit after sending SIGTERM before force-killing with SIGKILL.

lifecycle.preStop: sleep 15 — When Kubernetes removes a pod from the Service endpoints, it's not instantaneous. The kube-proxy and cloud load balancer take a few seconds to propagate the change. The preStop sleep gives in-flight requests time to complete before SIGTERM is sent to your process.
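These numbers have to fit together: the grace period countdown includes the preStop hook, so terminationGracePeriodSeconds must cover the sleep plus however long in-flight work and cleanup can take. A back-of-envelope check — the request and cleanup figures below are assumptions, substitute your own:

```shell
# Shutdown budget: everything below must finish before SIGKILL arrives.
pre_stop_sleep=15     # matches the preStop hook above
max_request_secs=30   # longest in-flight request (assumed)
app_cleanup_secs=5    # app's own shutdown work after SIGTERM (assumed)

needed=$((pre_stop_sleep + max_request_secs + app_cleanup_secs))
echo "terminationGracePeriodSeconds must be >= $needed"
```

With these figures the budget is 50 seconds, comfortably inside the 60-second grace period configured above.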

Step 3: Create a Health Endpoint

For nginx, add a simple health route using a ConfigMap:

bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-health
data:
  default.conf: |
    server {
        listen 80;
        location /healthz {
            return 200 "ok\n";
            add_header Content-Type text/plain;
        }
        location / {
            return 200 "hello\n";
        }
    }
EOF

kubectl patch deployment api-server --type=json -p='[
  {"op": "add", "path": "/spec/template/spec/volumes", "value": [{"name": "nginx-conf", "configMap": {"name": "nginx-health"}}]},
  {"op": "add", "path": "/spec/template/spec/containers/0/volumeMounts", "value": [{"name": "nginx-conf", "mountPath": "/etc/nginx/conf.d"}]}
]'

Step 4: Verify Zero-Downtime During Update

Send continuous traffic in one terminal:

bash
kubectl run load --rm -it --image=busybox -- sh -c \
  'while true; do wget -qO- http://api-server/healthz || echo "REQUEST FAILED"; sleep 0.1; done'

In another terminal, trigger an update:

bash
kubectl set image deployment/api-server api=nginx:1.25

Watch the rollout:

bash
kubectl rollout status deployment/api-server
# Waiting for deployment "api-server" rollout to finish: 1 out of 3 new replicas have been updated...
# Waiting for deployment "api-server" rollout to finish: 1 old replicas are pending termination...
# deployment "api-server" successfully rolled out

You should see no errors in the load terminal — every request returns ok.

Step 5: Roll Back if Something Goes Wrong

bash
# View rollout history
kubectl rollout history deployment/api-server

# Roll back to the previous version
kubectl rollout undo deployment/api-server

# Roll back to a specific revision
kubectl rollout undo deployment/api-server --to-revision=2

Verification Checklist

Before declaring a deployment zero-downtime capable, verify:

bash
# Readiness probe is configured
kubectl get deployment api-server -o jsonpath='{.spec.template.spec.containers[0].readinessProbe}'

# maxUnavailable is 0
kubectl get deployment api-server -o jsonpath='{.spec.strategy.rollingUpdate}'
# {"maxSurge":1,"maxUnavailable":0}

# terminationGracePeriodSeconds is set
kubectl get deployment api-server -o jsonpath='{.spec.template.spec.terminationGracePeriodSeconds}'
# 60

# preStop hook exists
kubectl get deployment api-server -o jsonpath='{.spec.template.spec.containers[0].lifecycle}'

Common Mistakes

Liveness probe timing out before the app starts — set initialDelaySeconds on the liveness probe to at least 2× your app's startup time. If liveness fires before the app is ready, Kubernetes restart-loops your pod indefinitely.

No preStop sleep — without it, SIGTERM fires while the load balancer still routes traffic to the pod. Even a 5-second sleep is better than nothing.

maxUnavailable: 25% (the default) — the default allows 25% of pods to be unavailable during an update. For a 4-replica deployment, that's 1 pod down while the new one starts. Fine for internal services, not for production APIs.
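Kubernetes rounds a percentage maxUnavailable down, so the effect depends on replica count. A quick sanity check, assuming the 25% default:

```shell
# With the default strategy, floor(replicas * 25%) pods may be down at once.
replicas=4
max_unavailable=$((replicas * 25 / 100))
echo "up to $max_unavailable of $replicas pods may be unavailable mid-rollout"
```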

We built Podscape to simplify Kubernetes workflows like this — logs, events, and cluster state in one interface, without switching tools.

Struggling with this in production?

We help teams fix these exact issues. Our engineers have deployed these patterns across production environments at scale.