Kubernetes Resource Management: Quotas, LimitRanges, and QoS Classes
ResourceQuota and LimitRange are the controls that prevent one namespace from starving another on a shared cluster. Without them, a single misconfigured deployment can exhaust all cluster memory and cause cascading pod evictions across every team. The three pillars: requests/limits on every container (QoS), namespace-level quotas (total resource caps), and LimitRange defaults (applied when teams forget to set requests).

Every shared Kubernetes cluster eventually faces the same incident: a team deploys a workload without resource limits, it leaks memory or gets hit by a traffic spike, the node fills up, the kubelet starts evicting pods in priority order, and suddenly unrelated teams' workloads are going down. ResourceQuota and LimitRange are the controls that prevent this from being someone else's problem.
The pattern: LimitRange sets safe defaults (so forgetting to set requests/limits still results in something reasonable), ResourceQuota caps total namespace consumption (so one team can't take the whole cluster), and QoS classes determine eviction priority (Guaranteed pods survive longer).
QoS Classes
Kubernetes assigns a QoS class to every pod based on its resource configuration. This determines eviction order when a node is under memory pressure:
| QoS Class | Condition | Eviction Priority |
|---|---|---|
| Guaranteed | All containers have requests == limits for CPU and memory | Last to be evicted |
| Burstable | At least one container has a CPU or memory request or limit, but the pod doesn't meet the Guaranteed criteria | Middle |
| BestEffort | No containers have any requests or limits | First to be evicted |
```yaml
# Guaranteed — requests == limits for both CPU and memory
spec:
  containers:
  - name: app
    resources:
      requests:
        cpu: "500m"
        memory: "256Mi"
      limits:
        cpu: "500m"     # Must equal request
        memory: "256Mi" # Must equal request

# Burstable — can burst to limits, but can request less
spec:
  containers:
  - name: app
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "1000m"    # Can burst 10x CPU
        memory: "512Mi" # Can burst 4x memory

# BestEffort — never set this intentionally for production workloads
spec:
  containers:
  - name: app
    # No resources block at all
```

Rule of thumb:

- Stateless web services → Burstable (allows traffic spikes).
- Databases, controllers, critical infrastructure → Guaranteed (predictable performance, last evicted).
- Background batch jobs → Burstable or BestEffort (low priority, acceptable to evict).
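Kubernetes records the assigned class on the pod itself, so you can audit what you actually deployed; a quick check (pod name hypothetical):

```bash
# Inspect the QoS class Kubernetes assigned to a pod
kubectl get pod payments-api-5d9f -n payments -o jsonpath='{.status.qosClass}'
# Output: Guaranteed, Burstable, or BestEffort

# List QoS classes across a namespace
kubectl get pods -n payments -o custom-columns='NAME:.metadata.name,QOS:.status.qosClass'
```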
Dynamic Resizing: In-place Vertical Scaling (2026 Standard)
In 2026, we no longer need to restart pods to change their resource requests. In-place pod resize (alpha since Kubernetes 1.27, beta and enabled by default since 1.33 via the InPlacePodVerticalScaling feature gate) lets you update container resources on the fly:
```yaml
spec:
  containers:
  - name: payments-api
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired # Key: resize without restart
    - resourceName: memory
      restartPolicy: NotRequired
```

When you update the resources block of a running pod, the kubelet adjusts the container's cgroup limits without restarting the process. This is the new standard for stateful workloads that need to scale vertically without downtime.
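With that policy in place, a resize is a patch against the pod's resize subresource; a minimal sketch, assuming kubectl v1.33+ (pod name and values hypothetical):

```bash
# Raise CPU for a running container without restarting it
kubectl patch pod payments-api-5d9f -n payments --subresource resize --patch \
  '{"spec":{"containers":[{"name":"payments-api","resources":{"requests":{"cpu":"750m"},"limits":{"cpu":"1500m"}}}]}}'

# The pod status reports the resources the kubelet actually applied
kubectl get pod payments-api-5d9f -n payments \
  -o jsonpath='{.status.containerStatuses[0].resources}'
```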
LimitRange: Enforced Defaults
LimitRange sets default requests/limits for containers that don't specify them, and enforces min/max bounds:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: payments
spec:
  limits:
  # Container defaults — applied when a container has no requests/limits
  - type: Container
    default:        # Default limits (used if no limits specified)
      cpu: "500m"
      memory: "256Mi"
    defaultRequest: # Default requests (used if no requests specified)
      cpu: "100m"
      memory: "128Mi"
    max:            # Maximum allowed per container
      cpu: "4000m"
      memory: "4Gi"
    min:            # Minimum allowed per container
      cpu: "50m"
      memory: "64Mi"

  # Pod-level limit (sum of all containers in the pod)
  - type: Pod
    max:
      cpu: "8000m"
      memory: "8Gi"

  # PVC size limits
  - type: PersistentVolumeClaim
    max:
      storage: "100Gi"
    min:
      storage: "1Gi"
```

When a pod is created in this namespace without resource requests/limits, Kubernetes automatically applies defaultRequest for scheduling and default for limits. This converts would-be BestEffort pods to Burstable, which is significantly safer for cluster stability.
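You can confirm the defaults are being injected by creating a bare pod and reading back what admission filled in; a quick check (pod name hypothetical):

```bash
# Create a pod with no resources block, then inspect what the LimitRange injected
kubectl run limits-probe --image=nginx -n payments
kubectl get pod limits-probe -n payments -o jsonpath='{.spec.containers[0].resources}'
# Expect requests of cpu=100m/memory=128Mi and limits of cpu=500m/memory=256Mi
kubectl delete pod limits-probe -n payments
```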
ResourceQuota: Namespace Caps
ResourceQuota enforces total resource consumption for a namespace:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: payments-quota
  namespace: payments
spec:
  hard:
    # Compute resources
    requests.cpu: "8"       # Total CPU requested by all pods: 8 cores
    limits.cpu: "16"        # Total CPU limit across all pods: 16 cores
    requests.memory: "16Gi" # Total memory requested
    limits.memory: "32Gi"   # Total memory limit

    # Object counts
    pods: "50"              # Max 50 pods in namespace
    services: "20"
    persistentvolumeclaims: "20"
    secrets: "50"
    configmaps: "50"

    # Storage
    requests.storage: "500Gi" # Total PVC storage requested
    # gp3.storageclass.storage.k8s.io/requests.storage: "500Gi" # Per-StorageClass limit
```

```bash
# Check quota usage
kubectl describe resourcequota payments-quota -n payments

# Name:            payments-quota
# Namespace:       payments
# Resource         Used   Hard
# --------         ----   ----
# limits.cpu       4      16
# limits.memory    8Gi    32Gi
# pods             12     50
# requests.cpu     1200m  8
# requests.memory  2Gi    16Gi
```
Priority Class Quotas
Different quotas for different priority levels prevent low-priority jobs from consuming all capacity:
```yaml
# Create PriorityClasses
# Note: preemptionPolicy defaults to PreemptLowerPriority — high-priority pods will
# actively evict lower-priority pods to schedule themselves, not just survive node eviction.
# Set preemptionPolicy: Never if you want priority-based eviction ordering only (no preemption).
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000
globalDefault: false
preemptionPolicy: PreemptLowerPriority # Default; explicitly set for clarity
description: "Production critical services"

---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 100
globalDefault: false
preemptionPolicy: Never # Low-priority jobs don't preempt anything
description: "Background batch jobs"

---
# Quota scoped to low-priority pods only
apiVersion: v1
kind: ResourceQuota
metadata:
  name: batch-quota
  namespace: payments
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    pods: "20"
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values: ["low-priority"] # Only applies to pods with this priority class
```
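Pods opt into a class by name via spec.priorityClassName; a minimal sketch of a batch pod that falls under batch-quota (pod name and image hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nightly-report    # Hypothetical example pod
  namespace: payments
spec:
  priorityClassName: low-priority # Matched by batch-quota's scopeSelector
  containers:
  - name: job
    image: reports:latest # Hypothetical image
    resources:
      requests:
        cpu: "250m"
        memory: "512Mi"
```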
Namespace Initialization Template
For a platform team managing many teams' namespaces, create a standard template:
```yaml
# Kustomize base for a new team namespace
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: PLACEHOLDER # Replaced by team overlay

resources:
  - namespace.yaml
  - limitrange.yaml
  - resourcequota.yaml
  - networkpolicy-default-deny.yaml
  - rbac.yaml

---
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: PLACEHOLDER
  labels:
    team: PLACEHOLDER
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted

---
# resourcequota.yaml — moderate team quota (adjust for team size)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
spec:
  hard:
    requests.cpu: "8"
    limits.cpu: "16"
    requests.memory: "16Gi"
    limits.memory: "32Gi"
    pods: "50"
    services: "20"
    persistentvolumeclaims: "20"
    requests.storage: "200Gi"

---
# limitrange.yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - type: Container
    default:
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    max:
      cpu: "4000m"
      memory: "4Gi"
```
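A team overlay then pins the real namespace onto the base; a minimal sketch, assuming one overlay directory per team (paths and names hypothetical):

```yaml
# overlays/payments/kustomization.yaml — hypothetical team overlay
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: payments # Sets metadata.namespace on the namespaced resources
resources:
  - ../../base
patches:
  - target:
      kind: Namespace
    patch: |-
      - op: replace
        path: /metadata/name
        value: payments
      - op: replace
        path: /metadata/labels/team
        value: payments
```

Render and apply with `kubectl apply -k overlays/payments`.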
Admission Enforcement with Kyverno
LimitRange and ResourceQuota reject out-of-bounds requests at admission, but a LimitRange also quietly fills in defaults, so teams never have to think about sizing. Kyverno can enforce the stricter rule that every container declares its own requests and limits:
```yaml
# Require all containers to set resource requests and limits
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  background: false
  rules:
  - name: check-container-limits
    match:
      any:
      - resources:
          kinds: [Pod]
    exclude:
      any:
      - resources:
          kinds: [Pod]
          namespaces: ["kube-system", "kube-public", "kube-node-lease"]
    validate:
      message: "Resource requests and limits are required for all containers."
      pattern:
        spec:
          containers:
          - resources:
              requests:
                cpu: "?*"
                memory: "?*"
              limits:
                cpu: "?*"
                memory: "?*"
```
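One subtlety when testing: the built-in LimitRanger admission plugin injects defaults before validating webhooks run, so in a namespace that already has LimitRange defaults this policy rarely fires. Test it somewhere without one; a quick negative test (namespace and pod names hypothetical):

```bash
# In a namespace with no LimitRange, a pod without a resources block is denied
kubectl create namespace scratch
kubectl run nolimits --image=nginx -n scratch
# Expect an admission denial from ClusterPolicy require-resource-limits:
#   "Resource requests and limits are required for all containers."
kubectl delete namespace scratch
```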
Frequently Asked Questions
What happens when a namespace hits its ResourceQuota limit?
New pod creation fails — the API returns `pods "x" is forbidden: exceeded quota` with HTTP status 403. Despite the "forbidden" wording, this is quota enforcement, not an RBAC authorization failure. Existing pods continue running — quota only gates new resource consumption. Teams see this as failed deployments. The quota-exceeded event is visible in namespace events: `kubectl get events -n payments --field-selector reason=FailedCreate`.
Should memory limits equal requests (Guaranteed) for everything?
No. The tradeoff: Guaranteed QoS gives predictable performance and last-evicted status, but you're reserving memory that the pod might not use. For APIs with variable load, Burstable is better — request what the pod needs at steady state, set limits high enough to absorb traffic spikes, and accept that it's evicted before Guaranteed workloads. Use Guaranteed for: databases (need predictable memory), controllers (cannot be evicted without cluster impact), and anything running on nodes where you're paying for dedicated capacity.
Can I increase a namespace's quota without a platform team review?
This is a policy decision, not a technical one — but the common pattern is: teams self-serve quota increases up to a pre-approved limit (e.g., double the base quota), and increases above that threshold require platform team review (capacity planning, cluster-wide impact). Backstage's Self-Service templates can automate the request workflow — see Platform Engineering: Building Golden Paths for Developer Self-Service.
For multi-tenancy patterns that build on ResourceQuota for full namespace isolation, see Kubernetes Multi-Tenancy Patterns. For VPA that provides right-sizing recommendations for setting accurate resource requests, see Kubernetes Cost Optimization and FinOps.
Struggling with resource contention or unexpected evictions on a shared cluster? Talk to us at Coding Protocols — we help platform teams design resource management policies that prevent noisy-neighbor problems without over-constraining development velocity.


