
Kubernetes Taints and Tolerations: Controlling Pod Placement

Intermediate | 25 min to complete | 11 min read

Taints repel pods from nodes. Tolerations allow specific pods to bypass those repulsions. Together they give you precise control over which workloads run on which nodes — GPU nodes, spot nodes, dedicated nodes, and nodes reserved for system components.

Before you begin

kubectl configured against a running cluster
Basic understanding of Kubernetes Pods, Nodes, and Deployments
Cluster-admin access to taint nodes

By default, Kubernetes will schedule any pod on any node that has enough CPU and memory. Taints let you mark a node so that only pods with a specific toleration can land on it. This is how you enforce node isolation — GPU nodes for ML workloads only, spot nodes only for batch jobs, system nodes reserved for monitoring infrastructure.

The Core Model

A taint is attached to a node. It has three parts:

key=value:effect

key and value are arbitrary strings you choose (value is optional); they are independent of node labels
effect controls what happens to pods that don't tolerate the taint:
- NoSchedule — new pods without the toleration won't be scheduled here
- PreferNoSchedule — the scheduler tries to avoid this node but may still use it
- NoExecute — new pods won't be scheduled AND existing pods without the toleration are evicted

A toleration on a pod says: "I can tolerate the taint with this key/value/effect."
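
To make the three-part format concrete, here is a minimal sketch that splits a taint string into its parts using plain shell parameter expansion. The helper name `parse_taint` is hypothetical and is not part of kubectl; it only illustrates how `key=value:effect` decomposes, including the case where the value is omitted.

```shell
# Split a taint spec of the form key=value:effect (value optional) into parts.
# parse_taint is a hypothetical helper for illustration, not a kubectl feature.
parse_taint() {
  local spec="$1"
  local effect="${spec##*:}"   # text after the last colon
  local kv="${spec%:*}"        # text before the last colon
  local key="${kv%%=*}"        # text before the first '=' (whole string if none)
  local value=""
  case "$kv" in
    *=*) value="${kv#*=}" ;;   # text after the first '='
  esac
  printf 'key=%s value=%s effect=%s\n' "$key" "$value" "$effect"
}

parse_taint "dedicated=gpu:NoSchedule"
# prints: key=dedicated value=gpu effect=NoSchedule
parse_taint "node.kubernetes.io/not-ready:NoExecute"
# prints: key=node.kubernetes.io/not-ready value= effect=NoExecute
```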

Adding a Taint to a Node

bash

# Taint a node
kubectl taint nodes node1 dedicated=gpu:NoSchedule

# Verify
kubectl describe node node1 | grep Taint
# Taints: dedicated=gpu:NoSchedule

# Remove a taint (note the trailing dash)
kubectl taint nodes node1 dedicated=gpu:NoSchedule-

To taint all nodes in a node group (useful when adding a new node group):

bash

# List nodes by label (e.g., node group label on EKS)
kubectl get nodes -l eks.amazonaws.com/nodegroup=gpu-nodes

# Taint each node returned by the label selector
# (-o name prints node/<name>, which kubectl taint accepts)
for node in $(kubectl get nodes -l eks.amazonaws.com/nodegroup=gpu-nodes -o name); do
  kubectl taint "$node" dedicated=gpu:NoSchedule
done

On managed node groups (EKS, GKE, AKS), prefer setting taints in the node group configuration so new nodes in the group are tainted automatically. Managing taints manually means new nodes start untainted until you notice.
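
If you do manage taints by hand, a periodic check for untainted nodes can catch that gap. The following is a rough sketch, not a hardened script: `untainted_nodes` is a hypothetical helper name, and it assumes `kubectl` and `awk` are available. It prints each node matching a label selector that has no taint with the given key.

```shell
# List nodes matching a label selector that do NOT carry a taint with the given key.
# untainted_nodes is a hypothetical helper name used for illustration.
untainted_nodes() {
  local selector="$1" key="$2"
  # One line per node: <name><TAB><space-separated taint keys, possibly empty>
  kubectl get nodes -l "$selector" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints[*].key}{"\n"}{end}' |
    awk -F'\t' -v key="$key" '
      {
        found = 0
        n = split($2, keys, " ")
        for (i = 1; i <= n; i++) if (keys[i] == key) found = 1
        if (!found) print $1
      }'
}
```

For the node group used above, the call would be `untainted_nodes eks.amazonaws.com/nodegroup=gpu-nodes dedicated`. An empty result means every node in the group carries the taint key.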

Adding a Toleration to a Pod

yaml

apiVersion: v1
kind: Pod
metadata:
  name: gpu-job
spec:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "gpu"
      effect: "NoSchedule"
  containers:
    - name: trainer
      image: pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime
      resources:
        limits:
          nvidia.com/gpu: 1

The toleration says: "I can tolerate the dedicated=gpu:NoSchedule taint." This pod can now be scheduled on the GPU node. The toleration permits placement there but does not require it.

Toleration Operators

Operator	Behaviour
`Equal`	Key and value must match exactly
`Exists`	Key must exist, value is ignored

yaml

# Exists: tolerate any taint with key "dedicated", any value, NoSchedule effect
tolerations:
  - key: "dedicated"
    operator: "Exists"
    effect: "NoSchedule"

# Wildcard: tolerate everything (effect omitted means all effects)
tolerations:
  - operator: "Exists"

The wildcard toleration is what system pods like kube-proxy and the CNI DaemonSet use — they need to run on every node regardless of taints.
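
The matching rules can be written down as a small shell function. This is a simplified model for illustration, not the scheduler's code. The function name `tolerates` and its argument order are assumptions, and an empty string stands for an omitted field.

```shell
# tolerates TOL_KEY TOL_OPERATOR TOL_VALUE TOL_EFFECT TAINT_KEY TAINT_VALUE TAINT_EFFECT
# Returns 0 when the toleration matches the taint. An empty string means "field omitted".
# Hypothetical helper that models the documented rules; it is not part of Kubernetes.
tolerates() {
  local t_key="$1" t_op="${2:-Equal}" t_value="$3" t_effect="$4"
  local n_key="$5" n_value="$6" n_effect="$7"

  # An omitted effect matches every effect; otherwise the effects must be equal.
  if [ -n "$t_effect" ] && [ "$t_effect" != "$n_effect" ]; then
    return 1
  fi

  case "$t_op" in
    Exists)
      # An omitted key with Exists matches every key; the value is never compared.
      [ -z "$t_key" ] || [ "$t_key" = "$n_key" ]
      ;;
    Equal)
      [ "$t_key" = "$n_key" ] && [ "$t_value" = "$n_value" ]
      ;;
    *)
      return 1
      ;;
  esac
}

tolerates dedicated Equal gpu NoSchedule dedicated gpu NoSchedule && echo "Equal: match"
tolerates dedicated Exists "" NoSchedule dedicated tpu NoSchedule && echo "Exists: match"
tolerates "" Exists "" "" maintenance true NoExecute && echo "wildcard: match"
tolerates dedicated Equal gpu NoSchedule dedicated tpu NoSchedule || echo "Equal, wrong value: no match"
```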

The Three Effects in Practice

NoSchedule — Node Isolation

Use when you want hard isolation: only pods that explicitly tolerate the taint can land here.

bash

kubectl taint nodes spot-node-1 spot=true:NoSchedule

Pods without a spot=true:NoSchedule toleration will never be scheduled on spot-node-1. Existing pods already running there are not affected.

PreferNoSchedule — Soft Preference

Use when you'd prefer pods avoid a node but it's not a hard requirement. The scheduler will try to find other nodes first.

bash

kubectl taint nodes node1 experimental=true:PreferNoSchedule

Useful for gradually migrating workloads off a node, or for nodes with degraded hardware that still function.

NoExecute — Eviction

Use when you want to evict existing pods in addition to blocking new ones. This is also what Kubernetes uses internally for node conditions.

bash

kubectl taint nodes node1 maintenance=true:NoExecute

All pods without a matching toleration are evicted immediately. Pods with a matching toleration stay indefinitely unless the toleration sets tolerationSeconds, which limits how long they remain after the taint is added:

yaml

tolerations:
  - key: "maintenance"
    operator: "Equal"
    value: "true"
    effect: "NoExecute"
    tolerationSeconds: 300   # pod is evicted after 5 minutes
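
Before applying a NoExecute taint, it can help to review what is running on the node and which taint keys each pod tolerates. The sketch below wraps a single `kubectl get pods` call; the helper name `pods_on_node` and the column titles are hypothetical choices.

```shell
# Show every pod on a node together with the keys of its tolerations,
# as a review step before applying a NoExecute taint.
# pods_on_node is a hypothetical helper name.
pods_on_node() {
  local node="$1"
  kubectl get pods --all-namespaces \
    --field-selector "spec.nodeName=$node" \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,TOLERATION_KEYS:.spec.tolerations[*].key'
}
```

Running `pods_on_node node1` lists the candidates for eviction. A wildcard toleration has no key, so the keys column alone is not conclusive; check the full pod spec for any pod you are unsure about.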

Built-in Taints

Kubernetes automatically taints nodes in bad states. Only the not-ready and unreachable taints use the NoExecute effect. The pressure and network taints use NoSchedule: they keep new pods away, but the taint itself does not evict running pods.

Taint	Condition	Effect
`node.kubernetes.io/not-ready`	Node not ready	NoExecute
`node.kubernetes.io/unreachable`	Node unreachable	NoExecute
`node.kubernetes.io/memory-pressure`	Node under memory pressure	NoSchedule
`node.kubernetes.io/disk-pressure`	Node under disk pressure	NoSchedule
`node.kubernetes.io/network-unavailable`	Node network unavailable	NoSchedule

By default, Kubernetes adds tolerations for not-ready:NoExecute and unreachable:NoExecute with tolerationSeconds: 300 to every pod that does not already declare them. This gives a 5-minute window before a pod is evicted from a failed node. You can override this:

yaml

tolerations:
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 60   # evict faster for latency-sensitive services
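
To confirm which tolerations a running pod actually ended up with, including any that were added automatically, you can print them from the live pod spec. This is a small sketch; `show_tolerations` is a hypothetical helper name, and the namespace falls back to `default` when omitted.

```shell
# Print one line per toleration on a pod: key, operator, effect, tolerationSeconds.
# show_tolerations is a hypothetical helper name.
show_tolerations() {
  local pod="$1" namespace="${2:-default}"
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{range .spec.tolerations[*]}{.key}{"\t"}{.operator}{"\t"}{.effect}{"\t"}{.tolerationSeconds}{"\n"}{end}'
}
```

For example, `show_tolerations gpu-job` prints the pod's tolerations one per line. If the override above was applied, the `node.kubernetes.io/not-ready` line should show 60 in the last column rather than the default 300.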

Common Patterns

Dedicated GPU Nodes

bash

# Taint the GPU node group
kubectl taint nodes -l node.kubernetes.io/instance-type=p3.2xlarge gpu=true:NoSchedule

yaml

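# Only GPU workloads tolerate the taint
spec:
  tolerations:
    - key: "gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  containers:
    - name: trainer
      image: pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime
      resources:        # resources belong to a container, not to the pod spec
        limits:
          nvidia.com/gpu: 1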

Non-GPU pods that land on a GPU node waste expensive GPU capacity. This taint prevents that.

Spot Instance Isolation

Run batch jobs and non-critical workloads on spot/preemptible nodes while keeping critical services on on-demand nodes.

bash

kubectl taint nodes -l node-type=spot spot=true:NoSchedule

yaml

# Batch job: tolerates spot
spec:
  tolerations:
    - key: "spot"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"

Critical services have no spot toleration — they'll never be scheduled on spot nodes.

System/Infrastructure Nodes

Reserve specific nodes for monitoring, ingress, or other platform infrastructure:

bash

kubectl taint nodes infra-node-1 role=infra:NoSchedule

yaml

# Prometheus, Grafana, ingress controllers: add the toleration and a nodeSelector
spec:
  tolerations:
    - key: "role"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
  # nodeSelector matches node labels, not taints:
  # the node must also carry the label role=infra
  nodeSelector:
    role: infra

Toleration alone isn't enough — it allows a pod to schedule on the tainted node but doesn't force it there. Pair with nodeSelector or nodeAffinity to direct the pod specifically to the infra node.
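
Because a nodeSelector matches node labels rather than taints, a dedicated node usually needs both a label and a taint. The sketch below applies the two together; `dedicate_node` is a hypothetical helper name, and reusing the same key and value for the label and the taint is a convention assumed here, not a requirement.

```shell
# Label and taint a node with the same key/value so that a nodeSelector
# (label match) and a toleration (taint match) can both refer to it.
# dedicate_node is a hypothetical helper name.
dedicate_node() {
  local node="$1" key="$2" value="$3"
  # The label is what nodeSelector and nodeAffinity match against.
  kubectl label nodes "$node" "$key=$value" --overwrite
  # The taint is what keeps pods without the toleration away.
  kubectl taint nodes "$node" "$key=$value:NoSchedule" --overwrite
}
```

For the infra node above this would be `dedicate_node infra-node-1 role infra`. The `--overwrite` flag makes the call safe to repeat if the label or taint is already present.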

Taints vs Node Affinity

Feature	Taints + Tolerations	Node Affinity
Direction	Node repels pods	Pod is attracted to nodes
Hard/soft	Hard (NoSchedule, NoExecute) or soft (PreferNoSchedule)	`required` (hard) or `preferred` (soft)
Scope	Any pod without toleration is blocked	Only affects pods with affinity rules
Best for	Dedicated nodes, hardware isolation	Topology constraints, zone spread

Use taints when you want to protect a node from unwanted tenants. Use node affinity when you want to express where a pod should go. In practice, you often use both together — taint the dedicated node AND add node affinity to direct the right pods there.
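
As an illustration of using both together, the sketch below prints a Pod manifest that carries a toleration for the dedicated=gpu:NoSchedule taint and a required node affinity rule. It assumes the GPU nodes are also labelled `dedicated=gpu`, which the taint alone does not do. The function name `dedicated_pod_manifest` is hypothetical.

```shell
# Print a Pod manifest that combines a toleration (permission to land on the
# tainted node) with required node affinity (obligation to land there).
# Assumes the target nodes carry the label dedicated=gpu as well as the taint.
dedicated_pod_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job
spec:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "gpu"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: "dedicated"
                operator: In
                values: ["gpu"]
  containers:
    - name: trainer
      image: pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime
EOF
}
```

To validate the manifest without creating anything, pipe it through a client-side dry run: `dedicated_pod_manifest | kubectl apply --dry-run=client -f -`.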

Verifying Pod Placement

bash

# Check which node a pod landed on
kubectl get pod gpu-job -o wide

# Check all pods on a specific node
kubectl get pods --all-namespaces --field-selector spec.nodeName=node1

# Check why a pod is pending (often taint-related)
kubectl describe pod gpu-job
# Look for "Events" section. Taint mismatches show up as:
# 0/3 nodes are available: 3 node(s) had untolerated taint {dedicated: gpu}

The describe output is the fastest way to diagnose scheduling failures. The message "N node(s) had untolerated taint" tells you exactly which taint is blocking the pod.
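
When the events are long, a small filter can pull out just the blocking taints. The sketch below assumes the message wording shown above ("untolerated taint {...}"); other Kubernetes versions may phrase it differently, in which case the pattern needs adjusting. `blocking_taints` is a hypothetical helper name.

```shell
# Read scheduler messages on stdin and print each distinct untolerated taint.
# Assumes the "untolerated taint {key: value}" wording; adjust for other versions.
# blocking_taints is a hypothetical helper name.
blocking_taints() {
  grep -o 'untolerated taint {[^}]*}' |
    sed -e 's/^untolerated taint {//' -e 's/}$//' |
    sort -u
}

echo '0/3 nodes are available: 3 node(s) had untolerated taint {dedicated: gpu}' | blocking_taints
# prints: dedicated: gpu
```

Against a live cluster, the same filter can be fed from `kubectl describe pod gpu-job | blocking_taints`.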
