Kubernetes Admission Webhooks: Validating and Mutating Workloads
Admission webhooks are the enforcement layer in Kubernetes — they intercept API requests before objects are persisted and can reject, modify, or audit them. Understanding how webhooks work, how to write them, and how not to break your cluster with a misconfigured one is essential platform engineering knowledge.

Every time you run kubectl apply, the API server processes the request through a series of admission plugins before persisting the object. Admission webhooks are user-defined plugins in this chain — you register an HTTPS endpoint that the API server calls with the admission request, and your response determines whether the request is accepted, rejected, or modified.
Webhooks are how Kyverno and OPA Gatekeeper work. They're also how cert-manager injects CA bundles, how Istio injects sidecar containers, and how dozens of other Kubernetes tools extend the API server's behaviour without modifying the API server itself.
Understanding the webhook model is important even if you never write one — because when a webhook fails, it can block all pod creation in your cluster, and you need to know how to diagnose and recover.
The Admission Chain
When the API server receives a request (e.g., kubectl apply for a Pod), it runs through:
AuthN → AuthZ → Mutating Admission → Object Schema Validation → Validating Admission → Persist to etcd
Mutating admission webhooks run first. They can modify the incoming object — injecting sidecars, adding labels, setting default values, adding annotations.
Validating admission webhooks run after mutation. They can only accept or reject — they cannot modify the object. If a validating webhook rejects the request, the object is not persisted.
Order matters: if you have both a mutating webhook that injects a label and a validating webhook that requires that label, the mutating webhook must run first (which it does — mutation precedes validation in the chain).
ValidatingWebhookConfiguration
```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: require-labels
webhooks:
  - name: require-labels.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None   # Required for webhooks that don't write to the cluster
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
    namespaceSelector:
      matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: ["kube-system", "kube-public", "cert-manager"]
    clientConfig:
      service:
        name: admission-webhook
        namespace: platform
        port: 443
        path: /validate
      caBundle: <base64-encoded-CA-cert>
    failurePolicy: Fail   # Reject the request if the webhook is unavailable
    timeoutSeconds: 10
```

failurePolicy: Fail means that if your webhook is unreachable, all matching API requests are rejected. A webhook deployment that goes down takes down pod creation in every matched namespace. This is the behaviour you want in production (security is preserved), but it's dangerous during webhook outages.
failurePolicy: Ignore means webhook unavailability is transparent — requests proceed as if the webhook approved them. Safer operationally but security controls become best-effort.
namespaceSelector scopes the webhook to specific namespaces. Always exclude kube-system and other platform namespaces to prevent the webhook from blocking system components if it fails.
timeoutSeconds defaults to 10 seconds. If your webhook doesn't respond within this window, the request fails (with failurePolicy: Fail) or succeeds (with failurePolicy: Ignore). Keep webhook response times well under 5 seconds.
MutatingWebhookConfiguration
```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: inject-labels
webhooks:
  - name: inject-labels.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    rules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["deployments"]
    clientConfig:
      service:
        name: admission-webhook
        namespace: platform
        port: 443
        path: /mutate
      caBundle: <base64-encoded-CA-cert>
    failurePolicy: Fail
    reinvocationPolicy: IfNeeded   # Re-run if another webhook modified the object
```

reinvocationPolicy: IfNeeded re-runs the mutating webhook if another mutating webhook modified the object after this one ran. Use this when your mutation depends on the final state of the object, not just the initial state.
Writing a Webhook in Go
A webhook is an HTTPS server that handles POST /validate or POST /mutate requests:
```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"

	admissionv1 "k8s.io/api/admission/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func validatePod(w http.ResponseWriter, r *http.Request) {
	// Decode the AdmissionReview request
	var admissionReview admissionv1.AdmissionReview
	if err := json.NewDecoder(r.Body).Decode(&admissionReview); err != nil {
		http.Error(w, fmt.Sprintf("decode error: %v", err), http.StatusBadRequest)
		return
	}

	request := admissionReview.Request

	// Decode the Pod from the request
	var pod corev1.Pod
	if err := json.Unmarshal(request.Object.Raw, &pod); err != nil {
		http.Error(w, fmt.Sprintf("unmarshal error: %v", err), http.StatusBadRequest)
		return
	}

	// Validation logic: require "team" label
	response := &admissionv1.AdmissionResponse{
		UID: request.UID, // echoing the request UID is mandatory
	}

	if _, ok := pod.Labels["team"]; !ok {
		response.Allowed = false
		response.Result = &metav1.Status{
			Code:    400,
			Message: "Pod must have a 'team' label",
		}
	} else {
		response.Allowed = true
	}

	// Send the response
	admissionReview.Response = response
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(admissionReview); err != nil {
		http.Error(w, fmt.Sprintf("encode error: %v", err), http.StatusInternalServerError)
	}
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/validate", validatePod)
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// TLS required — the Kubernetes API server only calls HTTPS webhook endpoints
	log.Fatal(http.ListenAndServeTLS(":8443", "/certs/tls.crt", "/certs/tls.key", mux))
}
```

Mutating Webhook: Adding a JSON Patch
Mutating webhooks return a JSON Patch describing the modifications:
```go
func mutatePod(w http.ResponseWriter, r *http.Request) {
	// ... decode the AdmissionReview and the Pod, same as above

	// Build a JSON Patch to add a label. If the labels map doesn't
	// exist yet, create it first with its own "add" operation.
	var patch []map[string]interface{}
	if pod.Labels == nil {
		patch = append(patch, map[string]interface{}{
			"op": "add", "path": "/metadata/labels", "value": map[string]string{},
		})
	}
	patch = append(patch, map[string]interface{}{
		"op":    "add",
		"path":  "/metadata/labels/injected-by",
		"value": "admission-webhook",
	})

	patchBytes, _ := json.Marshal(patch)
	patchType := admissionv1.PatchTypeJSONPatch

	response := &admissionv1.AdmissionResponse{
		UID:       request.UID,
		Allowed:   true,
		Patch:     patchBytes,
		PatchType: &patchType,
	}
	// ... send the response, same as above
}
```

TLS: The Operational Complexity
Webhooks must be served over HTTPS with a certificate trusted by the Kubernetes API server. This is the main operational hurdle for custom webhooks.
Option 1: cert-manager (recommended)
```yaml
# cert-manager Certificate for the webhook
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: admission-webhook-cert
  namespace: platform
spec:
  secretName: admission-webhook-tls
  dnsNames:
    - admission-webhook.platform.svc
    - admission-webhook.platform.svc.cluster.local
  issuerRef:
    name: cluster-issuer
    kind: ClusterIssuer
```

Then reference the cert in the webhook configuration using the CA injection annotation:
```yaml
metadata:
  annotations:
    cert-manager.io/inject-ca-from: platform/admission-webhook-cert
webhooks:
  - clientConfig:
      caBundle: ""   # Populated automatically by cert-manager
```

cert-manager automatically populates caBundle and rotates the certificate.
Option 2: kubebuilder / controller-runtime
kubebuilder generates the webhook TLS infrastructure and registers the webhook automatically:
```go
//+kubebuilder:webhook:path=/validate-v1-pod,mutating=false,failurePolicy=fail,groups="",versions=v1,resources=pods,verbs=create;update,name=vpod.kb.io,admissionReviewVersions=v1,sideEffects=None

func (r *PodValidator) ValidateCreate(ctx context.Context, obj runtime.Object) (admission.Warnings, error) {
	pod := obj.(*corev1.Pod)
	if _, ok := pod.Labels["team"]; !ok {
		return nil, fmt.Errorf("pod must have a 'team' label")
	}
	return nil, nil
}
```

kubebuilder generates the ValidatingWebhookConfiguration and handles the cert-manager integration.
Kyverno and OPA Gatekeeper: No-Code Policy Engines
For common policy patterns, using a policy engine is better than writing a custom webhook:
Kyverno expresses policies as YAML, not code:
```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-team-label
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Pod must have a 'team' label"
        pattern:
          metadata:
            labels:
              team: "?*"   # Must exist and be non-empty
```

OPA Gatekeeper uses Rego policies with a ConstraintTemplate + Constraint pattern:
```yaml
# ConstraintTemplate defines the Rego logic
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: requireteamlabel
spec:
  crd:
    spec:
      names:
        kind: RequireTeamLabel
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package requireteamlabel
        violation[{"msg": msg}] {
          not input.review.object.metadata.labels.team
          msg := "Pod must have a 'team' label"
        }
---
# Constraint applies the template
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: RequireTeamLabel
metadata:
  name: require-team-label
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
```

When to use each:
- Kyverno: Policy-as-YAML, lower learning curve, generate and mutate rules in addition to validation, better for platform teams without Rego experience
- OPA Gatekeeper: Rego is more expressive for complex cross-resource policies, better if your team already uses OPA elsewhere
- Custom webhook: Only when policy engines can't express your logic — usually complex external API calls, custom signature verification, or non-Kubernetes validation
Failure Modes and Recovery
The Cluster Lockout Scenario
A failurePolicy: Fail webhook with a misconfigured namespaceSelector that includes kube-system can block all pod creation, including system components. Recovery:
```shell
# Delete the broken webhook configuration
kubectl delete validatingwebhookconfiguration my-webhook
# or
kubectl delete mutatingwebhookconfiguration my-webhook

# If kubectl itself is blocked (webhook applies to all namespaces and is timing out),
# access the API server directly from a control plane node.
# For managed clusters, this often requires:
#   EKS: SSM into a node and use the in-cluster config
#   GKE: use the GKE-specific breakglass access
```

Prevention:
- Always include kube-system, kube-public, and the webhook's own namespace in namespaceSelector exclusions
- Use failurePolicy: Ignore during initial rollout; switch to Fail after validating stability
- Test webhook availability with a liveness probe and alert on unavailability
Webhook Timeout Causing Pod Scheduling Failures
If your webhook is slow (cold start, external API call, resource-constrained), requests time out and (with failurePolicy: Fail) pods fail to create. Profile your webhook:
```shell
# Check webhook latency from kube-apiserver metrics
kubectl get --raw /metrics | grep apiserver_admission_webhook_admission_duration
```

Webhook responses should be well under 1 second. Avoid synchronous external API calls in the webhook hot path — cache responses, or use an async pattern where the webhook always allows and a separate controller validates.
Frequently Asked Questions
Can a webhook call external APIs?
Yes, but carefully. A synchronous call to an external API (authorization service, policy database) adds that API's latency to every Kubernetes API request in the matched scope. If the external API has a 200ms P99, every pod creation adds 200ms. If the external API goes down with failurePolicy: Fail, pod creation stops. Cache responses aggressively and design for external API unavailability.
How do I test a webhook locally?
Use kind or a local cluster, deploy your webhook with a self-signed cert:
```shell
# Generate certs for local testing
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem \
  -days 365 -nodes -subj "/CN=admission-webhook.platform.svc"
# Run the webhook locally and use port-forwarding or a NodePort
```

For unit testing the webhook handler logic directly, mock the AdmissionReview request — no cluster required.
What's the difference between a webhook and a controller?
A webhook is called synchronously during an API request — it can accept/reject/modify before the object is persisted. It runs in the write path.
A controller watches for existing objects and reconciles state asynchronously — it can't reject the initial creation but can delete, modify, or react to objects after they're created.
Use a webhook when you need to prevent non-compliant objects from being created. Use a controller when you need to react to objects that already exist or enforce eventual consistency.
For Kyverno policies that use webhooks under the hood, see Kubernetes Security Hardening: A Production Checklist. For building full operators that combine webhooks and controllers, see Kubernetes Operators: Building Custom Controllers with CRDs.
Implementing admission control for a multi-team platform? Talk to us at Coding Protocols — we help platform teams design webhook-based policy enforcement that doesn't become a cluster availability risk.


