Platform Engineering
12 min read · May 9, 2026

Kubernetes Resource Management: Quotas, LimitRanges, and QoS Classes

ResourceQuota and LimitRange are the controls that prevent one namespace from starving another on a shared cluster. Without them, a single misconfigured deployment can exhaust all cluster memory and cause cascading pod evictions across every team. The three pillars: requests/limits on every container (QoS), namespace-level quotas (total resource caps), and LimitRange defaults (enforced when teams forget to set requests).

Coding Protocols Team

Every shared Kubernetes cluster eventually faces the same incident: a team deploys a workload without resource limits, it memory-leaks or gets traffic-spiked, the node fills up, kubelet starts evicting pods in priority order, and suddenly unrelated teams' workloads are going down. ResourceQuota and LimitRange are the controls that prevent this from being someone else's problem.

The pattern: LimitRange sets safe defaults (so forgetting to set requests/limits still results in something reasonable), ResourceQuota caps total namespace consumption (so one team can't take the whole cluster), and QoS classes determine eviction priority (Guaranteed pods survive longer).


QoS Classes

Kubernetes assigns a QoS class to every pod based on its resource configuration. This determines eviction order when a node is under memory pressure:

| QoS Class | Condition | Eviction Priority |
|---|---|---|
| Guaranteed | All containers have requests == limits for CPU and memory | Last to be evicted |
| Burstable | At least one container has requests or limits set (but not Guaranteed) | Middle |
| BestEffort | No containers have any requests or limits | First to be evicted |
```yaml
# Guaranteed — requests == limits for both CPU and memory
spec:
  containers:
    - name: app
      resources:
        requests:
          cpu: "500m"
          memory: "256Mi"
        limits:
          cpu: "500m"      # Must equal request
          memory: "256Mi"  # Must equal request

# Burstable — can burst to limits, but can request less
spec:
  containers:
    - name: app
      resources:
        requests:
          cpu: "100m"
          memory: "128Mi"
        limits:
          cpu: "1000m"    # Can burst 10x CPU
          memory: "512Mi" # Can burst 4x memory

# BestEffort — never set this intentionally for production workloads
spec:
  containers:
    - name: app
      # No resources block at all
```

Rule of thumb: stateless web services → Burstable (absorbs traffic spikes); databases, controllers, and critical infrastructure → Guaranteed (predictable performance, last evicted); background batch jobs → Burstable or BestEffort (low priority, acceptable to evict). You can verify the assigned class with `kubectl get pod <name> -o jsonpath='{.status.qosClass}'`.


Dynamic Resizing: In-place Vertical Scaling (2026 Standard)

As of 2026, you no longer always need to restart a pod to change its resources. In-place pod resize (alpha since Kubernetes 1.27, beta and enabled by default since 1.33) lets you adjust a running container's requests and limits:

```yaml
spec:
  containers:
    - name: payments-api
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired    # Key: resize without restart
        - resourceName: memory
          restartPolicy: NotRequired
```

Resizes are applied through the pod's `resize` subresource (e.g. `kubectl patch pod payments-api --subresource resize ...`); the kubelet then adjusts the container's cgroup limits without restarting the process. One caveat: decreasing a memory limit generally still requires a container restart, since in-use memory cannot be safely reclaimed. This makes in-place resize the default choice for stateful workloads that need to scale vertically without downtime.


LimitRange: Enforced Defaults

LimitRange sets default requests/limits for containers that don't specify them, and enforces min/max bounds:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: payments
spec:
  limits:
    # Container defaults — applied when a container has no requests/limits
    - type: Container
      default:          # Default limits (what's used if no limits specified)
        cpu: "500m"
        memory: "256Mi"
      defaultRequest:   # Default requests (what's used if no requests specified)
        cpu: "100m"
        memory: "128Mi"
      max:              # Maximum allowed per container
        cpu: "4000m"
        memory: "4Gi"
      min:              # Minimum allowed per container
        cpu: "50m"
        memory: "64Mi"

    # Pod-level limit (sum of all containers in the pod)
    - type: Pod
      max:
        cpu: "8000m"
        memory: "8Gi"

    # PVC size limits
    - type: PersistentVolumeClaim
      max:
        storage: "100Gi"
      min:
        storage: "1Gi"
```

When a pod is created in this namespace without resource requests/limits, Kubernetes automatically applies defaultRequest for scheduling and default for limits. This converts BestEffort pods to Burstable, which is significantly safer for cluster stability.
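LimitRange can also cap how far limits may exceed requests, bounding the burst ratio that determines Burstable behavior. A sketch using `maxLimitRequestRatio` (the ratio values are illustrative, not from the article):

```yaml
# Cap the limit-to-request ratio per container — rejects pods that
# request tiny amounts but declare huge limits ("quota gaming")
apiVersion: v1
kind: LimitRange
metadata:
  name: burst-ratio
  namespace: payments
spec:
  limits:
    - type: Container
      maxLimitRequestRatio:
        cpu: "10"     # limit may be at most 10x the CPU request
        memory: "4"   # limit may be at most 4x the memory request
```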


ResourceQuota: Namespace Caps

ResourceQuota enforces total resource consumption for a namespace:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: payments-quota
  namespace: payments
spec:
  hard:
    # Compute resources
    requests.cpu: "8"        # Total CPU requested by all pods: 8 cores
    limits.cpu: "16"         # Total CPU limit across all pods: 16 cores
    requests.memory: "16Gi"  # Total memory requested
    limits.memory: "32Gi"    # Total memory limit

    # Object counts
    pods: "50"               # Max 50 pods in namespace
    services: "20"
    persistentvolumeclaims: "20"
    secrets: "50"
    configmaps: "50"

    # Storage
    requests.storage: "500Gi"    # Total PVC storage requested
    # gp3.storageclass.storage.k8s.io/requests.storage: "500Gi"  # Per-StorageClass limit
```
```bash
# Check quota usage
kubectl describe resourcequota payments-quota -n payments

# Name:            payments-quota
# Namespace:       payments
# Resource         Used    Hard
# --------         ----    ----
# limits.cpu       4       16
# limits.memory    8Gi     32Gi
# pods             12      50
# requests.cpu     1200m   8
# requests.memory  2Gi     16Gi
```

Priority Class Quotas

Different quotas for different priority levels — prevent low-priority jobs from consuming all capacity:

```yaml
# Create PriorityClasses
# Note: preemptionPolicy defaults to PreemptLowerPriority — high-priority pods will
# actively evict lower-priority pods to schedule themselves, not just survive node eviction.
# Set preemptionPolicy: Never if you want priority-based eviction ordering only (no preemption).
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000
globalDefault: false
preemptionPolicy: PreemptLowerPriority    # Default; explicitly set for clarity
description: "Production critical services"

---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 100
globalDefault: false
preemptionPolicy: Never    # Low-priority jobs don't preempt anything
description: "Background batch jobs"

---
# Quota scoped to low-priority pods only
apiVersion: v1
kind: ResourceQuota
metadata:
  name: batch-quota
  namespace: payments
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    pods: "20"
  scopeSelector:
    matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["low-priority"]    # Only applies to pods with this priority class
```
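A pod opts into the scoped quota simply by naming the class. A minimal sketch (the pod name and image are illustrative, not from the article):

```yaml
# A batch pod counted against batch-quota via its priorityClassName
apiVersion: v1
kind: Pod
metadata:
  name: nightly-report        # hypothetical batch job
  namespace: payments
spec:
  priorityClassName: low-priority   # ties the pod to the scoped quota
  containers:
    - name: job
      image: reports:latest          # illustrative image
      resources:
        requests:
          cpu: "250m"
          memory: "512Mi"
```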

Namespace Initialization Template

For a platform team managing many teams' namespaces, create a standard template:

```yaml
# Kustomize base for a new team namespace
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: PLACEHOLDER    # Replaced by team overlay

resources:
  - namespace.yaml
  - limitrange.yaml
  - resourcequota.yaml
  - networkpolicy-default-deny.yaml
  - rbac.yaml

---
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: PLACEHOLDER
  labels:
    team: PLACEHOLDER
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted

---
# resourcequota.yaml — moderate team quota (adjust for team size)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
spec:
  hard:
    requests.cpu: "8"
    limits.cpu: "16"
    requests.memory: "16Gi"
    limits.memory: "32Gi"
    pods: "50"
    services: "20"
    persistentvolumeclaims: "20"
    requests.storage: "200Gi"

---
# limitrange.yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
    - type: Container
      default:
        cpu: "500m"
        memory: "256Mi"
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      max:
        cpu: "4000m"
        memory: "4Gi"
```
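The kustomization above lists networkpolicy-default-deny.yaml but doesn't show it. A common default-deny sketch (an assumption about the intended policy; adjust for your CNI and DNS egress needs):

```yaml
# networkpolicy-default-deny.yaml — deny all ingress and egress by default;
# teams then add explicit allow policies for the traffic they need
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}        # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
```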

Admission Enforcement with Kyverno

LimitRange and ResourceQuota both enforce at admission time, but LimitRange silently injects defaults rather than making teams state their intent, and quota rejections produce opaque errors. Kyverno can require that every container declares its own requests and limits, with a clear failure message:

```yaml
# Require all containers to set resource requests and limits
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  background: false
  rules:
    - name: check-container-limits
      match:
        any:
          - resources:
              kinds: [Pod]
      exclude:
        any:
          - resources:
              kinds: [Pod]
              namespaces: ["kube-system", "kube-public", "kube-node-lease"]
      validate:
        message: "Resource requests and limits are required for all containers."
        pattern:
          spec:
            containers:
              - resources:
                  requests:
                    cpu: "?*"
                    memory: "?*"
                  limits:
                    cpu: "?*"
                    memory: "?*"
```
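One gap worth noting: a pattern keyed on `spec.containers` alone lets init containers slip through unchecked. A sketch extending the same validate pattern with Kyverno's optional-field anchor `=()` (my addition, not part of the article's policy):

```yaml
# Extended validate pattern — also covers initContainers when present
      validate:
        message: "Resource requests and limits are required for all containers."
        pattern:
          spec:
            containers:
              - resources:
                  requests: {cpu: "?*", memory: "?*"}
                  limits: {cpu: "?*", memory: "?*"}
            =(initContainers):    # =() means: enforce only if the field exists
              - resources:
                  requests: {cpu: "?*", memory: "?*"}
                  limits: {cpu: "?*", memory: "?*"}
```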

Frequently Asked Questions

What happens when a namespace hits its ResourceQuota limit?

New pod creation fails — the API server rejects the request with an error like `pods "x" is forbidden: exceeded quota` (a quota enforcement error, not a 403 authorization failure). Existing pods keep running; quota only gates new resource consumption, so teams experience this as failed deployments. The rejection is visible in namespace events: `kubectl get events -n payments --field-selector reason=FailedCreate`.

Should memory limits equal requests (Guaranteed) for everything?

No. The tradeoff: Guaranteed QoS gives predictable performance and last-evicted status, but you're reserving memory that the pod might not use. For APIs with variable load, Burstable is better — request what the pod needs at idle, allow limits for traffic spikes, and accept that it's evicted before Guaranteed workloads. Use Guaranteed for: databases (need predictable memory), controllers (cannot be evicted without cluster impact), and anything running on nodes where you're paying for dedicated capacity.

Can I increase a namespace's quota without a platform team review?

This is a policy decision, not a technical one — but the common pattern is: teams self-serve quota increases up to a pre-approved limit (e.g., double the base quota), and increases above that threshold require platform team review (capacity planning, cluster-wide impact). Backstage's Self-Service templates can automate the request workflow — see Platform Engineering: Building Golden Paths for Developer Self-Service.


For multi-tenancy patterns that build on ResourceQuota for full namespace isolation, see Kubernetes Multi-Tenancy Patterns. For VPA right-sizing recommendations that help set accurate resource requests, see Kubernetes Cost Optimization and FinOps.

Struggling with resource contention or unexpected evictions on a shared cluster? Talk to us at Coding Protocols — we help platform teams design resource management policies that prevent noisy-neighbor problems without over-constraining development velocity.

Related Topics

Kubernetes
Resource Management
ResourceQuota
LimitRange
QoS
Multi-Tenancy
Platform Engineering
EKS
