Kubernetes Security Hardening: A Production Checklist
Default Kubernetes settings are not secure defaults — they're permissive defaults designed to get clusters running quickly. Hardening for production means working through the API server, node, workload, network, and supply chain layers systematically. Here's the checklist.

Default Kubernetes is permissive by design. The defaults get you to a running cluster quickly, but they ship with anonymous API authentication enabled (limited in scope, but an unnecessary surface), pods free to run as root, and services reachable from anywhere in the cluster. Hardening is the process of closing those defaults systematically.
This post works through the full hardening stack: API server security, node hardening, workload security (PSA), network isolation, RBAC, secrets management, and supply chain controls. Each section includes the audit commands to check your current state and the remediation for common gaps.
Threat Model
Before hardening, understand what you're protecting against:
External attacker (internet): Kubernetes API server and etcd should never be internet-accessible. Cloud load balancers in front of the API server with IP allowlisting are the minimum for managed clusters.
Compromised workload: A pod running malicious code (via supply chain attack, RCE in the application) should be contained — it shouldn't be able to reach other pods, read secrets it doesn't own, or escalate to cluster-admin.
Insider threat / misconfigured access: An overly permissive RBAC binding should not give a developer cluster-admin access. Namespace boundaries should be enforced.
Supply chain attack: A malicious container image or CI/CD pipeline modification shouldn't be deployable to production.
Work through hardening in order: workload isolation first (highest risk reduction), then network, then RBAC, then supply chain.
1. API Server Hardening
Audit Logging
Enable comprehensive audit logging to track who did what:
# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log request bodies for sensitive resources
  - level: Request
    resources:
      - group: ""
        resources: ["secrets", "configmaps", "serviceaccounts"]
  # Log metadata for most workload resources
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods", "services"]
      - group: "apps" # deployments live in the apps group, not core
        resources: ["deployments"]
  # Ignore high-volume control plane reads
  - level: None
    users: ["system:kube-controller-manager", "system:kube-scheduler"]
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
        resources: ["endpoints", "services"]
For managed clusters (EKS, GKE, AKS), enable audit logs via the platform:
# EKS: enable audit logs
aws eks update-cluster-config \
--name my-cluster \
  --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'
Disable Anonymous Authentication
On self-managed clusters, verify anonymous auth is disabled:
# Check the API server flags (kubeadm static pod manifest)
grep anonymous-auth /etc/kubernetes/manifests/kube-apiserver.yaml
# Should see: --anonymous-auth=false
# If the flag is absent, anonymous auth is ENABLED — that's the default
On managed clusters you can't set this flag yourself; the providers restrict what anonymous requests can reach — EKS, for instance, binds system:anonymous only to the system:public-info-viewer discovery role.
Restrict API Server Access
# For EKS: restrict API server to specific IP ranges
aws eks update-cluster-config \
  --name my-cluster \
  --resources-vpc-config \
  endpointPublicAccess=true,endpointPrivateAccess=true,publicAccessCidrs=203.0.113.0/32 # Only your IPs
The ideal: endpointPublicAccess=false with endpointPrivateAccess=true — API access only from within the VPC, via a VPN or bastion.
2. Node Hardening
Use Minimal OS Images
Use purpose-built container OS images (Amazon Linux 2023, Bottlerocket, COS on GKE) rather than general-purpose Linux. These images:
- Have no package manager (can't apt install tools after compromise)
- Expose a minimal attack surface
- Enable seccomp by default
- Have immutable root filesystems (Bottlerocket)
Bottlerocket for EKS:
# Terraform EKS module
eks_managed_node_groups = {
  default = {
    ami_type = "BOTTLEROCKET_x86_64"
    # Bottlerocket has no SSH by default — access via SSM or admin container
  }
}
Node Security Groups
Restrict node-to-node traffic to only what Kubernetes needs:
- Kubelet port (10250) from the API server IP range only
- Container runtime socket never exposed
- No SSH open to the internet (use SSM or a bastion)
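As a sketch, the first rule above might look like this in Terraform (the resource names aws_security_group.nodes and aws_security_group.control_plane are hypothetical placeholders for your own security groups; adapt to how your module exposes them):

```hcl
# Hypothetical: allow kubelet (10250) only from the control plane security group
resource "aws_security_group_rule" "kubelet_from_control_plane" {
  type                     = "ingress"
  from_port                = 10250
  to_port                  = 10250
  protocol                 = "tcp"
  security_group_id        = aws_security_group.nodes.id
  source_security_group_id = aws_security_group.control_plane.id
}
```

Using a source security group rather than a CIDR means the rule keeps working when control plane IPs change.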
IMDSv2 Enforcement
On AWS, enforce IMDSv2 (token-required metadata service) to prevent SSRF attacks from reaching the instance metadata:
# In Terraform node group launch template
metadata_options {
http_endpoint = "enabled"
http_tokens = "required" # Enforces IMDSv2
http_put_response_hop_limit = 1 # Prevents pod-level SSRF to metadata
}
With http_put_response_hop_limit = 1, the IMDSv2 token response's TTL expires as it crosses the node's container network hop, so pods can never obtain a metadata token — only host-level processes can reach the endpoint.
3. Pod Security Admission
PSA enforces security profiles at the namespace level. Production namespaces should run restricted:
# Audit all namespaces — see what would fail restricted
kubectl get ns -o name | xargs -I {} \
  kubectl label {} \
  pod-security.kubernetes.io/audit=restricted \
  pod-security.kubernetes.io/audit-version=latest \
  --overwrite 2>/dev/null

# Violations surface as warnings on apply and as
# pod-security.kubernetes.io annotations in the API server audit log
Enforce profiles per namespace:
# Production: strict
kubectl label namespace production \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/enforce-version=latest \
  pod-security.kubernetes.io/warn=restricted

# Dev/staging: baseline
kubectl label namespace staging \
  pod-security.kubernetes.io/enforce=baseline \
  pod-security.kubernetes.io/warn=restricted

# Platform/system: must stay privileged for DaemonSets with host access
kubectl label namespace kube-system \
  pod-security.kubernetes.io/enforce=privileged
restricted profile requirements (your pods must satisfy all):
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault # or Localhost
  containers:
    - securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
        readOnlyRootFilesystem: true # Recommended but not required by restricted
  volumes:
    # Allowed: configMap, emptyDir, secret, downwardAPI, projected, csi, persistentVolumeClaim
    # Blocked: hostPath (host namespaces — hostNetwork, hostPID, hostIPC — are blocked as pod-level fields)
For workloads that can't immediately meet restricted (legacy apps running as root), use baseline as a stepping stone — it blocks the most dangerous configurations (privileged containers, hostPath, host namespaces) without requiring non-root.
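For a concrete feel of the gap between the two profiles, here's a hypothetical pod spec that passes baseline but fails restricted — it isn't privileged and touches no host namespaces, but it runs as root and doesn't drop capabilities:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: legacy-app # hypothetical example
  namespace: staging
spec:
  containers:
    - name: app
      image: registry.example.com/legacy-app:1.0 # placeholder image
      # baseline: OK — not privileged, no hostPath, no host namespaces
      # restricted: rejected — missing runAsNonRoot, seccompProfile,
      # allowPrivilegeEscalation: false, and capabilities.drop: ["ALL"]
```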
4. RBAC Hardening
Audit for Over-Permissioned Bindings
# Find all ClusterRoleBindings that grant cluster-admin
kubectl get clusterrolebindings -o json | \
  jq '.items[] | select(.roleRef.name == "cluster-admin") |
    {name: .metadata.name, subjects: .subjects}'

# Find ClusterRoleBindings with edit or admin rights (non-platform subjects)
kubectl get clusterrolebindings -o json | \
  jq '.items[] | select(.roleRef.name == "edit" or .roleRef.name == "admin") |
    {name: .metadata.name, subjects: .subjects}'
Investigate any cluster-admin binding not attached to the platform team or system components. Every cluster-admin binding is a blast radius that extends to every resource in the cluster.
Disable Default Service Account Token Automount
Default service accounts in every namespace automatically get a token mounted into every pod. This token has the permissions of the default service account — often minimal, but it's an attack surface that shouldn't exist by default:
# Patch the default SA in each namespace to not automount tokens
kubectl patch serviceaccount default -n production \
-p '{"automountServiceAccountToken": false}'
# Verify
kubectl get serviceaccount default -n production -o jsonpath='{.automountServiceAccountToken}'
For pods that need API access, create dedicated service accounts with minimal permissions:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
automountServiceAccountToken: false # Set at SA level

# Override per-pod only when needed:
# spec.automountServiceAccountToken: true
Least-Privilege Service Account Pattern
# Create a dedicated SA
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-reader
  namespace: production
---
# Create a minimal Role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metrics-reader
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["pods"] # nodes are cluster-scoped — reading them needs a ClusterRole
    verbs: ["get", "list"]
---
# Bind them
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metrics-reader
  namespace: production
subjects:
  - kind: ServiceAccount
    name: metrics-reader
    namespace: production
roleRef:
  kind: Role
  name: metrics-reader
  apiGroup: rbac.authorization.k8s.io
Audit RBAC with Tools
# kubectl who-can: who can perform an action?
# Installed via `kubectl krew install who-can`. Alternative: standalone `kubectl-who-can` binary.
kubectl who-can create pods -n production

# rakkess: access matrix for a user/SA
kubectl rakkess --sa production:my-app

# rbac-lookup: what can a subject do?
kubectl rbac-lookup my-app -n production
5. Network Isolation
Default-Deny NetworkPolicy
Apply to every namespace:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  egress:
    # Allow DNS to kube-dns only
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
Automate this for all new namespaces using Kyverno generate rules so it's applied automatically on namespace creation.
Egress to External IPs
If pods should not reach arbitrary internet destinations, add CIDR-based egress policies:
# Allow only specific external destinations — two common patterns
spec:
  egress:
    # Pattern 1: internal VPC only
    - to:
        - ipBlock:
            cidr: 10.0.0.0/8 # Internal VPC
    # Pattern 2: internet but not internal ranges
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8 # Block internal; be explicit about what external IPs you allow
For zero-trust egress, combine NetworkPolicy with a proxy (Squid, mitmproxy, or a cloud NAT gateway with allowlisting).
6. Secrets Management
Never Use ConfigMaps for Secrets
Kubernetes Secrets are only base64-encoded (not encrypted) at the API level, but unlike ConfigMaps they can be encrypted at rest in etcd (if you configure it) and carry their own RBAC resource type, so access to them can be restricted independently of configuration data.
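To see why base64 is encoding rather than encryption, note that anyone who can read the Secret object recovers the plaintext with no key at all (hypothetical secret value shown):

```shell
# Kubernetes stores this value in the Secret's data field, base64-encoded
encoded=$(printf 's3cr3t-password' | base64)
echo "stored as: $encoded"

# Decoding needs no key — any principal with 'get' on secrets sees plaintext
printf '%s' "$encoded" | base64 -d # → s3cr3t-password
```

This is why RBAC on secrets and encryption at rest matter: base64 alone protects nothing.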
# Self-managed: check whether the API server has an encryption config at all
grep encryption-provider-config /etc/kubernetes/manifests/kube-apiserver.yaml

# EKS: check whether envelope encryption is already enabled
aws eks describe-cluster --name my-cluster --query 'cluster.encryptionConfig'

# EKS: enable envelope encryption
aws eks associate-encryption-config \
  --cluster-name my-cluster \
  --encryption-config '[{"resources":["secrets"],"provider":{"keyArn":"arn:aws:kms:..."}}]'
Use External Secrets Operator
ESO pulls secrets from Vault, AWS SSM/Secrets Manager, or GCP Secret Manager and creates Kubernetes Secrets automatically:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secretsmanager
    kind: ClusterSecretStore
  target:
    name: database-credentials
    creationPolicy: Owner
  data:
    - secretKey: DB_PASSWORD
      remoteRef:
        key: production/postgres/password
        property: password
ESO keeps the Kubernetes Secret in sync with the external store. When you rotate the secret in Secrets Manager, ESO updates the Kubernetes Secret within the refreshInterval.
See Secrets Management in Kubernetes: Vault vs External Secrets Operator for the full comparison.
7. Supply Chain Security
Image Signing and Verification
Sign images with cosign and enforce signature verification in production:
# Sign image after build
cosign sign --key cosign.key gcr.io/my-project/myapp:v1.0.0
# Kyverno policy: require signed images in production
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-images
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-image-signature
      match:
        any:
          - resources:
              kinds: ["Pod"]
              namespaces: ["production"]
      verifyImages:
        - imageReferences: ["gcr.io/my-project/*"]
          attestors:
            - count: 1
              entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      ...
                      -----END PUBLIC KEY-----
For CI/CD pipelines, consider keyless signing via Sigstore's Fulcio CA and Rekor transparency log — no private key to manage, identity is bound to the OIDC token of the pipeline runner:
# Keyless signing (GitHub Actions, GitLab CI, etc.)
cosign sign --yes $IMAGE_DIGEST
# Verification
cosign verify --certificate-identity-regexp="https://github.com/your-org/your-repo" \
--certificate-oidc-issuer="https://token.actions.githubusercontent.com" \
  $IMAGE_DIGEST
Keyless signing is the recommended approach for automated pipelines in 2026 — the signing identity is cryptographically tied to the pipeline's OIDC token rather than a long-lived private key.
SBOM and Vulnerability Scanning
# Generate SBOM
syft gcr.io/my-project/myapp:v1.0.0 -o spdx-json > sbom.json

# Scan for CVEs
trivy image gcr.io/my-project/myapp:v1.0.0 \
  --severity CRITICAL,HIGH \
  --exit-code 1 # Fail pipeline on critical CVEs
Integrate scanning into CI/CD as a gate before push to production registry.
Admission Control with Kyverno
Kyverno policies enforce security requirements at admission time — before workloads are created:
# Require non-root containers
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-non-root
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-run-as-non-root
      match:
        any:
          - resources:
              kinds: ["Pod"]
              namespaces: ["production"]
      validate:
        message: "Containers must run as non-root"
        pattern:
          spec:
            =(securityContext):
              =(runAsNonRoot): true
            containers:
              - =(securityContext):
                  =(runAsNonRoot): true
---
# Require resource requests on all containers
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-requests
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-cpu-request
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "CPU and memory requests are required"
        pattern:
          spec:
            containers:
              - resources:
                  requests:
                    cpu: "?*"
                    memory: "?*"
8. CIS Benchmark Scanning
Run the CIS Kubernetes Benchmark against your cluster to get a scored baseline:
# kube-bench: runs CIS benchmark checks locally
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
kubectl logs job/kube-bench
# Or run directly (requires node access)
kube-bench run --targets node,master
kube-bench output scores each CIS control as PASS, FAIL, or WARN with remediation instructions. Prioritise FAIL items in the master and etcd sections — these are the highest-severity controls.
For managed clusters (EKS, GKE, AKS), many control plane controls are FAIL because the platform doesn't expose the relevant flags. This is expected — managed clusters have different security models. Focus on the node and policies sections for actionable findings.
Security Hardening Priority Order
If you're starting from scratch, work in this order for maximum risk reduction per unit of effort:
- PSA on production namespaces (restricted or baseline) — prevents the most common container escape vectors
- Default-deny NetworkPolicy on all namespaces — contains blast radius of a compromised pod
- Disable default SA token automount — removes unnecessary API access from all pods
- RBAC audit — identify and revoke cluster-admin bindings not owned by platform team
- IMDSv2 enforcement (AWS) — blocks pod-level credential theft via SSRF
- Image signing + Kyverno verification — prevents tampered images from reaching production
- ESO for secrets — removes plaintext secrets from ConfigMaps and git history
- Audit logging to SIEM — enables post-incident forensics
Frequently Asked Questions
What's the difference between PSA and Kyverno for pod security?
PSA (Pod Security Admission) is built into Kubernetes and enforces the three built-in profiles (privileged/baseline/restricted). Kyverno is a policy engine that can enforce arbitrary policies including PSA-equivalent rules plus custom ones (require labels, block specific images, require resource requests). Use both: PSA for the standard baseline, Kyverno for organisation-specific policies.
Is Falco necessary for security?
Falco provides runtime threat detection — it alerts when a pod unexpectedly opens a network connection, reads /etc/passwd, or runs bash. PSA and NetworkPolicy are preventive; Falco is detective. For organisations with a security operations team that can respond to Falco alerts, it adds real value. For teams without dedicated security response capacity, the alert noise may outweigh the benefit.
How do I handle privileged containers that my workload legitimately needs?
Some workloads (CNI plugins, CSI drivers, logging agents, monitoring agents) legitimately need elevated privileges. Use namespace-scoped PSA exemptions for the kube-system and platform namespaces where these run (enforce=privileged), and enforce restricted strictly on application namespaces. Don't relax application namespace security for platform DaemonSets.
What's the fastest win if I have zero hardening right now?
Apply default-deny NetworkPolicy to production namespaces and label them with PSA baseline enforcement. These two changes take under an hour, require no application code changes (baseline is permissive enough for most apps), and dramatically reduce the blast radius of a compromised pod.
For RBAC patterns and tooling, see Kubernetes RBAC in Practice. For PSA migration from PodSecurityPolicy, see Kubernetes Pod Security Admission: The PodSecurityPolicy Replacement Guide. For supply chain security, see Supply Chain Security Tools for Kubernetes. For container supply chain security covering image signing, Cosign, and Kyverno admission enforcement, see Container Image Security: Supply Chain from Build to Production. For NetworkPolicy patterns that implement the network security layer, see Kubernetes NetworkPolicy: Zero-Trust Networking for Multi-Team Clusters.
Hardening a Kubernetes cluster for a security audit or compliance requirement? Talk to us at Coding Protocols — we help platform teams build security baselines that satisfy auditors without blocking developers.


