Kubernetes

Zero-Downtime Deployments with Rolling Updates and Readiness Probes

Beginner · 25 min to complete · 8 min read

Most Kubernetes deployments drop traffic during updates because readiness probes are misconfigured or missing. This tutorial shows you the exact configuration that eliminates downtime — and how to verify it.

Before you begin

  • kubectl configured against a running cluster
  • Basic understanding of Kubernetes Pods and Deployments

The default Kubernetes rolling update strategy sounds safe: bring up new pods before terminating old ones. But if your readiness probe isn't configured correctly, Kubernetes will send traffic to pods that aren't ready yet — and your users see errors.

This tutorial covers the complete configuration that makes rolling updates actually zero-downtime.

Why Traffic Drops During Updates

When Kubernetes updates a deployment, it follows this sequence:

  1. Create a new pod with the updated image
  2. Wait for the pod to pass its readiness probe
  3. Add the pod to the Service endpoints
  4. Terminate an old pod
  5. Repeat until all pods are updated

The problem: if there's no readiness probe, step 2 considers the pod ready the moment the container starts. Your app might need 5–10 seconds to warm up its database connections, load config, or compile templates. During that window, the pod receives traffic it can't handle.

The second problem: when a pod receives SIGTERM (step 4), it might still be handling active requests. If the app exits immediately, those requests fail.
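The second problem can be handled in the application itself by trapping SIGTERM. A minimal sketch of a shell entrypoint that drains before exiting — the self-kill line below stands in for Kubernetes delivering the signal, and the request loop is a placeholder:

```shell
#!/bin/sh
# Sketch: graceful SIGTERM handling for a container entrypoint.
# Kubernetes sends SIGTERM (after any preStop hook runs), then SIGKILL
# once terminationGracePeriodSeconds expires. Trapping TERM lets the
# process finish in-flight work instead of dying mid-request.

shutdown=0
trap 'shutdown=1' TERM

# Simulate Kubernetes delivering SIGTERM ~0.2s from now.
( sleep 0.2; kill -TERM $$ ) &

served=0
while [ "$shutdown" -eq 0 ]; do
  served=$((served + 1))   # stand-in for serving one request
  sleep 0.05
done

echo "drained cleanly after $served requests"
```

The same pattern applies in any language: register a SIGTERM handler, stop accepting new work, finish what's in flight, then exit 0.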

Step 1: Deploy a Sample Application Without Probes

bash
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: nginx:1.24
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: api-server
spec:
  selector:
    app: api-server
  ports:
    - port: 80
      targetPort: 80
EOF

Step 2: Add Readiness and Liveness Probes

bash
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # Allow 1 extra pod during update
      maxUnavailable: 0   # Never reduce below desired count
  template:
    metadata:
      labels:
        app: api-server
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: api
          image: nginx:1.24
          ports:
            - containerPort: 80
          readinessProbe:
            httpGet:
              path: /healthz
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 3
            successThreshold: 1
          livenessProbe:
            httpGet:
              path: /healthz
              port: 80
            initialDelaySeconds: 15
            periodSeconds: 10
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "15"]
EOF

Note: nginx doesn't serve /healthz out of the box, so this rollout will stall at the first new pod until you add the health endpoint in Step 3 — apply both and the rollout completes.

Let's break down each piece:

maxUnavailable: 0 — Kubernetes must never have fewer than the desired replica count available. This forces it to bring up the new pod first (maxSurge: 1) before terminating any old pod.

readinessProbe — Until this passes, the pod does not receive traffic. initialDelaySeconds: 5 gives the app 5 seconds before the first check. failureThreshold: 3 means three consecutive failures before marking the pod unready.

livenessProbe — If this fails, Kubernetes restarts the container. Set initialDelaySeconds higher than your readiness probe — you don't want the liveness probe killing a pod that's still starting up.
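If startup time varies widely, a startupProbe (stable since Kubernetes v1.20) is a cleaner fix than inflating initialDelaySeconds: liveness and readiness checks are held off until it succeeds once. A sketch for the same container, reusing the /healthz path:

```yaml
startupProbe:
  httpGet:
    path: /healthz
    port: 80
  periodSeconds: 5
  failureThreshold: 12   # tolerates up to 60s of startup before restarting
```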

terminationGracePeriodSeconds: 60 — Kubernetes waits up to 60 seconds for the pod to exit after sending SIGTERM before force-killing with SIGKILL.

lifecycle.preStop: sleep 15 — When Kubernetes removes a pod from the Service endpoints, it's not instantaneous. The kube-proxy and cloud load balancer take a few seconds to propagate the change. The preStop sleep gives in-flight requests time to complete before SIGTERM is sent to your process.
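These numbers have to fit together: the grace period countdown includes the preStop hook, so terminationGracePeriodSeconds must cover the sleep plus however long in-flight work and cleanup can take. A back-of-envelope check — the request and cleanup figures below are assumptions, substitute your own:

```shell
# Shutdown budget: everything below must finish before SIGKILL arrives.
pre_stop_sleep=15     # matches the preStop hook above
max_request_secs=30   # longest in-flight request (assumed)
app_cleanup_secs=5    # app's own shutdown work after SIGTERM (assumed)

needed=$((pre_stop_sleep + max_request_secs + app_cleanup_secs))
echo "terminationGracePeriodSeconds must be >= $needed"
```

With these figures the budget is 50 seconds, comfortably inside the 60-second grace period configured above.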

Step 3: Create a Health Endpoint

For nginx, add a simple health route using a ConfigMap:

bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-health
data:
  default.conf: |
    server {
        listen 80;
        location /healthz {
            return 200 "ok\n";
            add_header Content-Type text/plain;
        }
        location / {
            return 200 "hello\n";
        }
    }
EOF

kubectl patch deployment api-server --type=json -p='[
  {"op": "add", "path": "/spec/template/spec/volumes", "value": [{"name": "nginx-conf", "configMap": {"name": "nginx-health"}}]},
  {"op": "add", "path": "/spec/template/spec/containers/0/volumeMounts", "value": [{"name": "nginx-conf", "mountPath": "/etc/nginx/conf.d"}]}
]'

Step 4: Verify Zero-Downtime During Update

Send continuous traffic in one terminal:

bash
kubectl run load --rm -it --image=busybox -- sh -c \
  'while true; do wget -qO- http://api-server/healthz || echo "REQUEST FAILED"; sleep 0.1; done'

In another terminal, trigger an update:

bash
kubectl set image deployment/api-server api=nginx:1.25

Watch the rollout:

bash
kubectl rollout status deployment/api-server
# Waiting for deployment "api-server" rollout to finish: 1 out of 3 new replicas have been updated...
# Waiting for deployment "api-server" rollout to finish: 1 old replicas are pending termination...
# deployment "api-server" successfully rolled out

You should see no errors in the load terminal — every request returns ok.

Step 5: Roll Back if Something Goes Wrong

bash
# View rollout history
kubectl rollout history deployment/api-server

# Roll back to the previous version
kubectl rollout undo deployment/api-server

# Roll back to a specific revision
kubectl rollout undo deployment/api-server --to-revision=2

Verification Checklist

Before declaring a deployment zero-downtime capable, verify:

bash
# Readiness probe is configured
kubectl get deployment api-server -o jsonpath='{.spec.template.spec.containers[0].readinessProbe}'

# maxUnavailable is 0
kubectl get deployment api-server -o jsonpath='{.spec.strategy.rollingUpdate}'
# {"maxSurge":1,"maxUnavailable":0}

# terminationGracePeriodSeconds is set
kubectl get deployment api-server -o jsonpath='{.spec.template.spec.terminationGracePeriodSeconds}'
# 60

# preStop hook exists
kubectl get deployment api-server -o jsonpath='{.spec.template.spec.containers[0].lifecycle}'

Common Mistakes

Liveness probe timing out before the app starts — set initialDelaySeconds on the liveness probe to at least 2× your app's startup time. If liveness fires before the app is ready, Kubernetes restart-loops your pod indefinitely.

No preStop sleep — without it, SIGTERM fires while the load balancer still routes traffic to the pod. Even a 5-second sleep is better than nothing.

maxUnavailable: 25% (the default) — the default allows 25% of pods to be unavailable during an update. For a 4-replica deployment, that's 1 pod down while the new one starts. Fine for internal services, not for production APIs.
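Kubernetes rounds a percentage maxUnavailable down, so the effect depends on replica count. A quick sanity check, assuming the 25% default:

```shell
# With the default strategy, floor(replicas * 25%) pods may be down at once.
replicas=4
max_unavailable=$((replicas * 25 / 100))
echo "up to $max_unavailable of $replicas pods may be unavailable mid-rollout"
```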

We built Podscape to simplify Kubernetes workflows like this — logs, events, and cluster state in one interface, without switching tools.

Struggling with this in production?

We help teams fix these exact issues. Our engineers have deployed these patterns across production environments at scale.