RBAC misconfigurations that break production (and how to fix them)
Most RBAC failures aren't from lack of permissions — they're from misconfigurations that silently over-grant access until something explodes. Here are the eight patterns I see repeatedly and exactly how to fix them.

I've done a lot of cluster audits. The pattern is almost always the same: RBAC was set up correctly at the start, then gradually degraded. A debugging session left a permissive binding that nobody removed. A Helm chart shipped with a ClusterRole that nobody read. The default service account quietly accumulated permissions over time.
Nobody notices until something breaks or, worse, until an auditor or attacker does.
This post covers the eight misconfigurations I see most often — not theoretical risks, but things that have caused real production incidents or failed real compliance reviews.
1. Wildcard verbs on sensitive resources
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["*"]

This grants create, delete, update, patch, list, watch, and get on secrets. The intention was usually "the app needs to read secrets." The result is a workload that can overwrite or exfiltrate every secret in the namespace.
Why it's dangerous: list + watch on secrets is particularly bad. A compromised pod with these permissions can dump every secret in the namespace in a single API call. list on secrets returns full secret values, not just metadata.
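To see why, run what a compromised pod would effectively run: a single list call returns every value in the namespace, base64-encoded but otherwise in the clear.

# One request, every secret value in the namespace
kubectl get secrets -o json \
  | jq '.items[] | {name: .metadata.name, data: .data}'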
Fix: Be explicit about verbs. If an app only reads secrets, say so:
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]

If it needs to read a specific secret, use resourceNames:
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
  resourceNames: ["my-app-credentials"]

2. Using ClusterRoleBinding when you need RoleBinding
kind: ClusterRoleBinding
roleRef:
  kind: ClusterRole
  name: pod-reader
subjects:
- kind: ServiceAccount
  name: my-app
  namespace: production

This grants pod-reader permissions across every namespace in the cluster, not just production. If pod-reader includes list pods, your app can list pods in kube-system, monitoring, and every tenant namespace.
Why it happens: ClusterRoleBinding is easier to write — you don't have to think about which namespace. The docs don't shout loudly enough that a RoleBinding achieves the same effect scoped correctly.
The rule: A ClusterRole can be bound with either a ClusterRoleBinding (cluster-wide) or a RoleBinding (namespace-scoped). Default to RoleBinding unless you have an explicit reason to go cluster-wide.
kind: RoleBinding
metadata:
  namespace: production
roleRef:
  kind: ClusterRole
  name: pod-reader
subjects:
- kind: ServiceAccount
  name: my-app
  namespace: production

3. The default service account over-permission trap
When you don't specify a serviceAccountName in your Pod spec, Kubernetes assigns the default service account in that namespace. If someone bound a permissive role to default — even temporarily, even "just to test" — every pod in that namespace inherits those permissions.
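Prevention is cheap: give each workload its own service account and reference it explicitly in the Pod spec, so nothing rides on default. A minimal sketch (names and image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  serviceAccountName: my-app        # dedicated SA, not the namespace default
  containers:
  - name: app
    image: registry.example.com/my-app:1.0

Existing clusters still need the cleanup below.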
# Find all bindings to the default SA
kubectl get rolebindings,clusterrolebindings -A -o json \
  | jq '.items[] | select(
      .subjects[]? |
      (.kind=="ServiceAccount" and .name=="default")
    ) | .metadata.name'

Fix: Audit and clean up bindings to default. Then opt out of automounting the token for workloads that don't call the API:
spec:
  automountServiceAccountToken: false

Or set this at the service account level to make it the default for all pods using that SA:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
automountServiceAccountToken: false

4. Aggregated ClusterRoles silently expanding permissions
Kubernetes ships with aggregated ClusterRoles (admin, edit, view) that collect permissions from other roles via label selectors. If you create a ClusterRole with the right label, its permissions get merged in automatically:
kind: ClusterRole
metadata:
  name: my-crd-permissions
  labels:
    rbac.authorization.k8s.io/aggregate-to-edit: "true"   # <-- merged into 'edit'
rules:
- apiGroups: ["mycompany.io"]
  resources: ["sensitiveresources"]
  verbs: ["*"]

Everyone bound to the edit ClusterRole now has full access to sensitiveresources, and nobody added an explicit binding. This is a silent expansion — the edit ClusterRole itself didn't change, but its effective permissions did.
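You can confirm the expansion directly: the aggregation controller copies matching rules into the parent role, so they show up when you inspect edit itself.

# The merged rules appear in the parent role's .rules
kubectl get clusterrole edit -o yaml | grep -B 2 -A 3 mycompany.io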
Fix: Audit every ClusterRole that has an aggregate-to-* label. Treat those labels as a trust boundary — they effectively make your role part of a shared, cluster-wide role.
kubectl get clusterroles -o json \
  | jq '.items[] | select((.metadata.labels // {}) |
      keys[] | startswith("rbac.authorization.k8s.io/aggregate-to"))
    | .metadata.name'

5. system:authenticated bindings
subjects:
- kind: Group
  name: system:authenticated
  apiGroup: rbac.authorization.k8s.io

This grants permissions to every authenticated user and service account in the cluster. I've found this in the wild more times than I'd like to admit — usually left by a Helm chart or copied from a StackOverflow answer. The person who wrote it meant "our team members," not "every workload running in the cluster."
system:unauthenticated is even worse: it grants access to requests that present no credentials at all. Check for both:
kubectl get clusterrolebindings -o json \
  | jq '.items[] | select(
      .subjects[]?.name == "system:authenticated" or
      .subjects[]?.name == "system:unauthenticated"
    ) | .metadata.name'

Fix: Replace with explicit user, group, or service account subjects. There is almost no legitimate reason to bind to system:authenticated cluster-wide.
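What an explicit binding looks like instead; the group and service account names below are placeholders for whatever your identity provider and workloads actually use:

subjects:
- kind: Group
  name: platform-engineers            # a real IdP/OIDC group, not a system: group
  apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount
  name: my-app
  namespace: production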
6. Leaving debug bindings in place
This is the most common one. An incident happens at 2 AM. Someone adds a permissive ClusterRoleBinding to get debugging access fast. The incident is resolved. The binding is never removed.
kind: ClusterRoleBinding
metadata:
  name: debug-access-temp   # "temp" — LOL
roleRef:
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: my-broken-app
  namespace: production

Six months later, my-broken-app has been running with cluster-admin and nobody knew.
Fix — two layers:
First, clean up by auditing for high-privilege bindings that reference non-system subjects:
kubectl get clusterrolebindings -o json \
  | jq '.items[] | select(.roleRef.name == "cluster-admin") |
      {name: .metadata.name, subjects: .subjects}'

Second, add expiry to any binding you create during an incident. Kubernetes doesn't support TTLs on RBAC objects natively, but you can enforce the practice via a custom admission webhook or just a documented runbook item: "every debug binding gets a GitHub issue tracking its removal."
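A lightweight version of that practice, without a webhook: label every debug binding with an expiry date when you create it, and have a scheduled job flag anything past due. A sketch, assuming you adopt an expires-after label convention like this one:

# At incident time: record when the binding should be gone
kubectl label clusterrolebinding debug-access-temp expires-after=2024-07-01

# Scheduled check: report any labelled binding whose date has passed
kubectl get clusterrolebindings -o json \
  | jq -r --arg today "$(date +%F)" \
    '.items[]
     | select(.metadata.labels["expires-after"] != null
              and .metadata.labels["expires-after"] < $today)
     | .metadata.name'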
7. Helm charts shipping over-privileged service accounts
Open any popular Helm chart and look at the templates/rbac.yaml. Many ship with ClusterRoles that include far more than the app actually needs — either because the developer wasn't sure what was needed, or because they wanted to avoid support tickets.
A common offender is the list/watch on all resources in all API groups:
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["get", "list", "watch"]

Read-only across everything looks harmless. It isn't. An attacker who compromises that pod can enumerate your entire cluster topology — all services, all secret values (get and list return the data, not just metadata), all config maps, all running workloads.
Fix: Override RBAC rules at install time if the chart supports it. Some charts expose an rbac.rules value for exactly this. If yours doesn't, file an issue or fork the template:
# values.yaml override
rbac:
  rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list"]

And always review rbac.yaml before helm install. Make it a mandatory step in your chart adoption checklist.
8. No audit log review
RBAC controls what's allowed. Audit logs tell you what's actually happening. Running without reviewing audit logs means you can have a correctly-configured RBAC policy and still miss that a service account is being used in unexpected ways — or that a compromised pod is probing the API for permissions.
Enable audit logging and set up alerts for:
- Any request that results in 403 Forbidden from a service account (legitimate apps shouldn't be hitting permission boundaries constantly)
- Any use of the list or watch verbs on secrets
- Any request from a pod in a sensitive namespace calling the Kubernetes API at all (if that pod shouldn't be calling the API)
On EKS, audit logs land in CloudWatch. A basic CloudWatch Insights query:
fields @timestamp, user.username, objectRef.resource, verb, responseStatus.code
| filter responseStatus.code = 403
| filter ispresent(user.username)
| sort @timestamp desc
| limit 100
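A similar query covers the second alert above, flagging anyone enumerating secrets (field names assume the standard EKS audit-log shape):

fields @timestamp, user.username, objectRef.namespace, verb
| filter objectRef.resource = "secrets" and verb in ["list", "watch"]
| sort @timestamp desc
| limit 100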
On self-managed clusters, configure the audit policy to capture at minimum RequestResponse level for secrets and Metadata level for everything else.
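A minimal sketch of a policy implementing that split:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Full request and response bodies for anything touching secrets
- level: RequestResponse
  resources:
  - group: ""
    resources: ["secrets"]
# Everything else: record who did what, but not the payloads
- level: Metadata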
The audit you should run right now
# 1. Who has cluster-admin?
kubectl get clusterrolebindings -o json \
  | jq -r '.items[] | select(.roleRef.name=="cluster-admin") |
      .metadata.name + " → " + ([.subjects[]? | .kind + "/" + .name] | join(", "))'

# 2. Which service accounts can list/get secrets?
kubectl auth can-i list secrets --as=system:serviceaccount:default:default
kubectl auth can-i get secrets --as=system:serviceaccount:default:default

# 3. Find wildcard rules anywhere in the cluster
kubectl get clusterroles,roles -A -o json \
  | jq '.items[] | select(.rules[]? | .verbs[] == "*") |
      .metadata.name + " (" + (.metadata.namespace // "cluster") + ")"'

# 4. Find bindings to system:authenticated
kubectl get clusterrolebindings,rolebindings -A -o json \
  | jq '.items[] | select(.subjects[]?.name == "system:authenticated") |
      .metadata.name'

Run these on any cluster you're responsible for. The results are usually surprising.
The pattern behind all of them
Every misconfiguration above has the same root cause: RBAC was configured once and never revisited. Permissions accumulate. Bindings persist after their purpose is gone. Charts bring in roles nobody audited.
RBAC isn't a "set it and forget it" control. It needs to be reviewed the same way you'd review IAM policies in AWS — regularly, with tooling, and with a bias toward removing what isn't justified.
If you want to build these permissions interactively, I have an RBAC Generator that outputs correct Role + RoleBinding YAML, and a Kubernetes YAML Linter that catches common structural issues before you apply.


