Kubernetes Resource Management: Quotas, LimitRanges, and QoS Classes
ResourceQuota and LimitRange are the controls that prevent one namespace from starving another on a shared cluster. Without them, a single misconfigured deployment can exhaust all cluster memory and cause cascading pod evictions across every team. The three pillars: requests/limits on every container (QoS), namespace-level quotas (total resource caps), and LimitRange defaults (applied when teams forget to set requests).

Every shared Kubernetes cluster eventually faces the same incident: a team deploys a workload without resource limits, it leaks memory or gets hit by a traffic spike, the node fills up, the kubelet starts evicting pods in priority order, and suddenly unrelated teams' workloads are going down. ResourceQuota and LimitRange are the controls that prevent this from being someone else's problem.
The pattern: LimitRange sets safe defaults (so forgetting to set requests/limits still results in something reasonable), ResourceQuota caps total namespace consumption (so one team can't take the whole cluster), and QoS classes determine eviction priority (Guaranteed pods survive longer).
QoS Classes
Kubernetes assigns a QoS class to every pod based on its resource configuration. This determines eviction order when a node is under memory pressure:
| QoS Class | Condition | Eviction Priority |
|---|---|---|
| Guaranteed | All containers have requests == limits for CPU and memory | Last to be evicted |
| Burstable | At least one container has a CPU or memory request or limit, but the pod doesn't meet the Guaranteed criteria | Middle |
| BestEffort | No containers have any requests or limits | First to be evicted |
```yaml
# Guaranteed — requests == limits for both CPU and memory
spec:
  containers:
  - name: app
    resources:
      requests:
        cpu: "500m"
        memory: "256Mi"
      limits:
        cpu: "500m"     # Must equal request
        memory: "256Mi" # Must equal request

# Burstable — can burst to limits, but can request less
spec:
  containers:
  - name: app
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "1000m"    # Can burst 10x CPU
        memory: "512Mi" # Can burst 4x memory

# BestEffort — never set this intentionally for production workloads
spec:
  containers:
  - name: app
    # No resources block at all
```

Rule of thumb:

- Stateless web services → Burstable (allows traffic spikes).
- Databases, controllers, critical infrastructure → Guaranteed (predictable performance, last evicted).
- Background batch jobs → Burstable or BestEffort (low priority, acceptable to evict).
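Kubernetes records the assigned class on the pod itself, so you can audit what you actually deployed; a quick check (pod name hypothetical):

```bash
# Inspect the QoS class Kubernetes assigned to a pod
kubectl get pod payments-api-5d9f -n payments -o jsonpath='{.status.qosClass}'
# Output: Guaranteed, Burstable, or BestEffort

# List QoS classes across a namespace
kubectl get pods -n payments -o custom-columns='NAME:.metadata.name,QOS:.status.qosClass'
```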
Dynamic Resizing: In-place Vertical Scaling (2026 Standard)
In 2026, we no longer need to restart pods to change their resource requests. In-place pod resize (alpha since Kubernetes 1.27, beta and enabled by default since 1.33 via the InPlacePodVerticalScaling feature gate) lets you update container resources on the fly:
```yaml
spec:
  containers:
  - name: payments-api
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired # Key: resize without restart
    - resourceName: memory
      restartPolicy: NotRequired
```

When you update the resources block of a running pod, the kubelet adjusts the container's cgroup limits without restarting the process. This is the new standard for stateful workloads that need to scale vertically without downtime.
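With that policy in place, a resize is a patch against the pod's resize subresource; a minimal sketch, assuming kubectl v1.33+ (pod name and values hypothetical):

```bash
# Raise CPU for a running container without restarting it
kubectl patch pod payments-api-5d9f -n payments --subresource resize --patch \
  '{"spec":{"containers":[{"name":"payments-api","resources":{"requests":{"cpu":"750m"},"limits":{"cpu":"1500m"}}}]}}'

# The pod status reports the resources the kubelet actually applied
kubectl get pod payments-api-5d9f -n payments \
  -o jsonpath='{.status.containerStatuses[0].resources}'
```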
LimitRange: Enforced Defaults
LimitRange sets default requests/limits for containers that don't specify them, and enforces min/max bounds:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: payments
spec:
  limits:
  # Container defaults — applied when a container has no requests/limits
  - type: Container
    default:        # Default limits (used if no limits specified)
      cpu: "500m"
      memory: "256Mi"
    defaultRequest: # Default requests (used if no requests specified)
      cpu: "100m"
      memory: "128Mi"
    max:            # Maximum allowed per container
      cpu: "4000m"
      memory: "4Gi"
    min:            # Minimum allowed per container
      cpu: "50m"
      memory: "64Mi"

  # Pod-level limit (sum of all containers in the pod)
  - type: Pod
    max:
      cpu: "8000m"
      memory: "8Gi"

  # PVC size limits
  - type: PersistentVolumeClaim
    max:
      storage: "100Gi"
    min:
      storage: "1Gi"
```

When a pod is created in this namespace without resource requests/limits, Kubernetes automatically applies defaultRequest for scheduling and default for limits. This converts would-be BestEffort pods to Burstable, which is significantly safer for cluster stability.
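You can confirm the defaults are being injected by creating a bare pod and reading back what admission filled in; a quick check (pod name hypothetical):

```bash
# Create a pod with no resources block, then inspect what the LimitRange injected
kubectl run limits-probe --image=nginx -n payments
kubectl get pod limits-probe -n payments -o jsonpath='{.spec.containers[0].resources}'
# Expect requests of cpu=100m/memory=128Mi and limits of cpu=500m/memory=256Mi
kubectl delete pod limits-probe -n payments
```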
ResourceQuota: Namespace Caps
ResourceQuota enforces total resource consumption for a namespace:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: payments-quota
  namespace: payments
spec:
  hard:
    # Compute resources
    requests.cpu: "8"       # Total CPU requested by all pods: 8 cores
    limits.cpu: "16"        # Total CPU limit across all pods: 16 cores
    requests.memory: "16Gi" # Total memory requested
    limits.memory: "32Gi"   # Total memory limit

    # Object counts
    pods: "50"              # Max 50 pods in namespace
    services: "20"
    persistentvolumeclaims: "20"
    secrets: "50"
    configmaps: "50"

    # Storage
    requests.storage: "500Gi" # Total PVC storage requested
    # gp3.storageclass.storage.k8s.io/requests.storage: "500Gi" # Per-StorageClass limit
```

```bash
# Check quota usage
kubectl describe resourcequota payments-quota -n payments

# Name:            payments-quota
# Namespace:       payments
# Resource         Used   Hard
# --------         ----   ----
# limits.cpu       4      16
# limits.memory    8Gi    32Gi
# pods             12     50
# requests.cpu     1200m  8
# requests.memory  2Gi    16Gi
```
Priority Class Quotas
Different quotas for different priority levels prevent low-priority jobs from consuming all capacity:
```yaml
# Create PriorityClasses
# Note: preemptionPolicy defaults to PreemptLowerPriority — high-priority pods will
# actively evict lower-priority pods to schedule themselves, not just survive node eviction.
# Set preemptionPolicy: Never if you want priority-based eviction ordering only (no preemption).
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000
globalDefault: false
preemptionPolicy: PreemptLowerPriority # Default; explicitly set for clarity
description: "Production critical services"

---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 100
globalDefault: false
preemptionPolicy: Never # Low-priority jobs don't preempt anything
description: "Background batch jobs"

---
# Quota scoped to low-priority pods only
apiVersion: v1
kind: ResourceQuota
metadata:
  name: batch-quota
  namespace: payments
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    pods: "20"
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values: ["low-priority"] # Only applies to pods with this priority class
```
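Pods opt into a class by name via spec.priorityClassName; a minimal sketch of a batch pod that falls under batch-quota (pod name and image hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nightly-report    # Hypothetical example pod
  namespace: payments
spec:
  priorityClassName: low-priority # Matched by batch-quota's scopeSelector
  containers:
  - name: job
    image: reports:latest # Hypothetical image
    resources:
      requests:
        cpu: "250m"
        memory: "512Mi"
```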
Namespace Initialization Template
For a platform team managing many teams' namespaces, create a standard template:
```yaml
# Kustomize base for a new team namespace
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: PLACEHOLDER # Replaced by team overlay

resources:
  - namespace.yaml
  - limitrange.yaml
  - resourcequota.yaml
  - networkpolicy-default-deny.yaml
  - rbac.yaml

---
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: PLACEHOLDER
  labels:
    team: PLACEHOLDER
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted

---
# resourcequota.yaml — moderate team quota (adjust for team size)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
spec:
  hard:
    requests.cpu: "8"
    limits.cpu: "16"
    requests.memory: "16Gi"
    limits.memory: "32Gi"
    pods: "50"
    services: "20"
    persistentvolumeclaims: "20"
    requests.storage: "200Gi"

---
# limitrange.yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - type: Container
    default:
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    max:
      cpu: "4000m"
      memory: "4Gi"
```
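A team overlay then pins the real namespace onto the base; a minimal sketch, assuming one overlay directory per team (paths and names hypothetical):

```yaml
# overlays/payments/kustomization.yaml — hypothetical team overlay
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: payments # Sets metadata.namespace on the namespaced resources
resources:
  - ../../base
patches:
  - target:
      kind: Namespace
    patch: |-
      - op: replace
        path: /metadata/name
        value: payments
      - op: replace
        path: /metadata/labels/team
        value: payments
```

Render and apply with `kubectl apply -k overlays/payments`.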
Admission Enforcement with Kyverno
LimitRange and ResourceQuota reject out-of-bounds requests at admission, but a LimitRange also quietly fills in defaults, so teams never have to think about sizing. Kyverno can enforce the stricter rule that every container declares its own requests and limits:
```yaml
# Require all containers to set resource requests and limits
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  background: false
  rules:
  - name: check-container-limits
    match:
      any:
      - resources:
          kinds: [Pod]
    exclude:
      any:
      - resources:
          kinds: [Pod]
          namespaces: ["kube-system", "kube-public", "kube-node-lease"]
    validate:
      message: "Resource requests and limits are required for all containers."
      pattern:
        spec:
          containers:
          - resources:
              requests:
                cpu: "?*"
                memory: "?*"
              limits:
                cpu: "?*"
                memory: "?*"
```
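One subtlety when testing: the built-in LimitRanger admission plugin injects defaults before validating webhooks run, so in a namespace that already has LimitRange defaults this policy rarely fires. Test it somewhere without one; a quick negative test (namespace and pod names hypothetical):

```bash
# In a namespace with no LimitRange, a pod without a resources block is denied
kubectl create namespace scratch
kubectl run nolimits --image=nginx -n scratch
# Expect an admission denial from ClusterPolicy require-resource-limits:
#   "Resource requests and limits are required for all containers."
kubectl delete namespace scratch
```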
Frequently Asked Questions
What happens when a namespace hits its ResourceQuota limit?
New pod creation fails — the API returns `pods "x" is forbidden: exceeded quota` with HTTP status 403. Despite the "forbidden" wording, this is quota enforcement, not an RBAC authorization failure. Existing pods continue running — quota only gates new resource consumption. Teams see this as failed deployments. The quota-exceeded event is visible in namespace events: `kubectl get events -n payments --field-selector reason=FailedCreate`.
Should memory limits equal requests (Guaranteed) for everything?
No. The tradeoff: Guaranteed QoS gives predictable performance and last-evicted status, but you're reserving memory that the pod might not use. For APIs with variable load, Burstable is better — request what the pod needs at steady state, set limits high enough to absorb traffic spikes, and accept that it's evicted before Guaranteed workloads. Use Guaranteed for: databases (need predictable memory), controllers (cannot be evicted without cluster impact), and anything running on nodes where you're paying for dedicated capacity.
Can I increase a namespace's quota without a platform team review?
This is a policy decision, not a technical one — but the common pattern is: teams self-serve quota increases up to a pre-approved limit (e.g., double the base quota), and increases above that threshold require platform team review (capacity planning, cluster-wide impact). Backstage's Self-Service templates can automate the request workflow — see Platform Engineering: Building Golden Paths for Developer Self-Service.
For multi-tenancy patterns that build on ResourceQuota for full namespace isolation, see Kubernetes Multi-Tenancy Patterns. For VPA that provides right-sizing recommendations for setting accurate resource requests, see Kubernetes Cost Optimization and FinOps.
Struggling with resource contention or unexpected evictions on a shared cluster? Talk to us at Coding Protocols — we help platform teams design resource management policies that prevent noisy-neighbor problems without over-constraining development velocity.


