14 min read · May 6, 2026

How to Install Karpenter on EKS: A Production-Ready Setup Guide

A step-by-step guide to installing Karpenter v1.x on EKS using the current API — NodePool, EC2NodeClass, and Helm. Includes IAM setup, common failure modes, and what to verify before sending production traffic through it.

Coding Protocols Team
Platform Engineering

Karpenter replaces the Cluster Autoscaler for most EKS workloads. It provisions nodes faster, bin-packs more efficiently, and supports spot interruption handling out of the box. If you're still running Cluster Autoscaler on EKS, it's worth the migration.

This guide covers a production-ready Karpenter v1.x installation. If you've seen older tutorials using Provisioner and AWSNodeTemplate, those are the legacy v0.x APIs — both were removed during the v0.x-to-v1 migration and no longer exist in v1.x. The current API uses NodePool and EC2NodeClass.


Prerequisites

  • EKS cluster running Kubernetes 1.27+ (Karpenter v1.x requires 1.27 minimum)
  • eksctl, kubectl, helm, and aws CLI installed and configured
  • Cluster using VPC CNI (the default for EKS)
  • Your AWS account ID and cluster name ready — you'll use these throughout

Set variables up front:

bash
export CLUSTER_NAME="my-cluster"
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export AWS_REGION="us-east-1"
export KARPENTER_VERSION="1.3.3"  # check https://github.com/aws/karpenter/releases for latest
export KARPENTER_NAMESPACE="kube-system"

Step 1: Create the Karpenter IAM Role

Karpenter's controller needs IAM permissions to call EC2 APIs: describe instance types, launch instances, terminate instances, and tag resources. It also needs to pass an IAM role to the instances it launches.

There are two ways to grant these permissions. In 2026, EKS Pod Identity is the preferred method: it is simpler than IRSA because it doesn't require OIDC provider management or service account annotations.

Option A: EKS Pod Identity (Recommended)

  1. Create the IAM role with a trust policy allowing pods.eks.amazonaws.com to assume it (a minimal sketch follows below).
  2. Associate the role with the Karpenter service account:
bash
aws eks create-pod-identity-association \
  --cluster-name "${CLUSTER_NAME}" \
  --namespace "${KARPENTER_NAMESPACE}" \
  --service-account karpenter \
  --role-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterControllerRole-${CLUSTER_NAME}"
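
For step 1, here's a minimal sketch of creating that role, assuming the KarpenterControllerPolicy from the CloudFormation stack described below already exists:

bash
# Trust policy: lets the EKS Pod Identity agent assume this role
cat > karpenter-controller-trust.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "pods.eks.amazonaws.com" },
      "Action": ["sts:AssumeRole", "sts:TagSession"]
    }
  ]
}
EOF

aws iam create-role \
  --role-name "KarpenterControllerRole-${CLUSTER_NAME}" \
  --assume-role-policy-document file://karpenter-controller-trust.json

aws iam attach-role-policy \
  --role-name "KarpenterControllerRole-${CLUSTER_NAME}" \
  --policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}"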

Option B: IRSA (Legacy)

If your cluster is running an older version or you prefer OIDC:

bash
eksctl create iamserviceaccount \
  --cluster "${CLUSTER_NAME}" \
  --namespace "${KARPENTER_NAMESPACE}" \
  --name karpenter \
  --role-name "KarpenterControllerRole-${CLUSTER_NAME}" \
  --attach-policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}" \
  --approve

But first you need the policy. The official Karpenter CloudFormation stack handles this. Download the template and deploy it (aws cloudformation deploy expects a local file path, not a URL):

bash
curl -fsSL "https://raw.githubusercontent.com/aws/karpenter-provider-aws/v${KARPENTER_VERSION}/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml" \
  -o cloudformation.yaml

aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file cloudformation.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"

This creates:

  • KarpenterControllerPolicy-<cluster> — EC2 permissions for the controller
  • KarpenterNodeRole-<cluster> — the IAM role attached to nodes Karpenter launches
  • KarpenterNodeInstanceProfile-<cluster> — the instance profile wrapping the node role

Important: the KarpenterNodeRole needs to be mapped in your aws-auth ConfigMap so that nodes Karpenter launches can join the cluster:

bash
eksctl create iamidentitymapping \
  --cluster "${CLUSTER_NAME}" \
  --region "${AWS_REGION}" \
  --arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}" \
  --username system:node:{{EC2PrivateDNSName}} \
  --group system:bootstrappers,system:nodes
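
If your cluster uses EKS access entries instead of the aws-auth ConfigMap (the default authentication mode on newer clusters), the equivalent is an access entry of type EC2_LINUX. A sketch, assuming access entries are enabled:

bash
aws eks create-access-entry \
  --cluster-name "${CLUSTER_NAME}" \
  --principal-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}" \
  --type EC2_LINUX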

Step 2: Tag Your Subnets and Security Groups

Karpenter discovers which subnets and security groups to use for new nodes via EC2 resource tags. Without these tags, Karpenter can't launch nodes.

Tag your node subnets:

bash
for SUBNET_ID in $(aws eks describe-cluster \
  --name "${CLUSTER_NAME}" \
  --query "cluster.resourcesVpcConfig.subnetIds[]" \
  --output text); do
  aws ec2 create-tags \
    --resources "${SUBNET_ID}" \
    --tags "Key=karpenter.sh/discovery,Value=${CLUSTER_NAME}"
done

Tag the cluster security group:

bash
CLUSTER_SG=$(aws eks describe-cluster \
  --name "${CLUSTER_NAME}" \
  --query "cluster.resourcesVpcConfig.clusterSecurityGroupId" \
  --output text)

aws ec2 create-tags \
  --resources "${CLUSTER_SG}" \
  --tags "Key=karpenter.sh/discovery,Value=${CLUSTER_NAME}"

These tags are what the EC2NodeClass resource references when selecting subnets and security groups.
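
Before moving on, it's worth confirming that discovery will actually find something. A quick check:

bash
aws ec2 describe-subnets \
  --filters "Name=tag:karpenter.sh/discovery,Values=${CLUSTER_NAME}" \
  --query "Subnets[].SubnetId" --output text

aws ec2 describe-security-groups \
  --filters "Name=tag:karpenter.sh/discovery,Values=${CLUSTER_NAME}" \
  --query "SecurityGroups[].GroupId" --output text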


Step 3: Install Karpenter with Helm

Karpenter's chart is published to a public OCI registry, so there's no Helm repo to add. Log out of public.ecr.aws first (cached credentials can break anonymous pulls), then install:

bash
helm registry logout public.ecr.aws 2>/dev/null || true

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "${KARPENTER_VERSION}" \
  --namespace "${KARPENTER_NAMESPACE}" \
  --create-namespace \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait

Key flags:

  • settings.clusterName — tells Karpenter which cluster it's managing
  • settings.interruptionQueue — the SQS queue name for spot interruption events (created by the CloudFormation stack)
  • Resource requests/limits — Karpenter runs as a deployment; right-size it to avoid OOM on large clusters

Verify the controller is running:

bash
kubectl get pods -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter

Expected output: two karpenter pods in Running state — the chart runs two controller replicas by default for high availability.


Step 4: Create an EC2NodeClass

EC2NodeClass defines the AWS-specific configuration for the nodes Karpenter provisions: AMI family, subnets, security groups, IAM instance profile, and block device mappings.

yaml
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2023
  role: "KarpenterNodeRole-${CLUSTER_NAME}"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  amiSelectorTerms:
    - alias: al2023@latest
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 50Gi
        volumeType: gp3
        encrypted: true
Apply it:

bash
envsubst < ec2nodeclass.yaml | kubectl apply -f -

Notes:

  • AL2023 is Amazon Linux 2023 — the recommended AMI family for new EKS clusters. AL2 (Amazon Linux 2) is still supported but entering maintenance.
  • amiSelectorTerms with alias: al2023@latest automatically tracks the latest EKS-optimised AL2023 AMI for your cluster version. This means node refreshes pick up security patches automatically.
  • Always set encrypted: true on block devices. The default is unencrypted.
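
If you'd rather control AMI rollouts explicitly than track @latest, amiSelectorTerms also accepts concrete AMI IDs. A sketch with a placeholder ID; pin it, test it in staging, then bump deliberately:

yaml
amiSelectorTerms:
  - id: ami-0123456789abcdef0  # placeholder: substitute a tested EKS-optimised AMI ID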

Step 5: Create a NodePool

NodePool defines the scheduling constraints for nodes Karpenter provisions: instance types, zones, capacity type (on-demand vs spot), taints, labels, and expiry.

yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        billing-team: platform
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      expireAfter: 720h  # 30 days — force node rotation for security patches
  limits:
    cpu: 1000
    memory: 1000Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
Apply it:

bash
kubectl apply -f nodepool.yaml

Key decisions in this config:

Capacity type: ["spot", "on-demand"] allows Karpenter to use spot when available and fall back to on-demand. Spot can be 70–90% cheaper. If you need guaranteed capacity for a workload, add a separate NodePool with values: ["on-demand"] and use node selectors to route critical pods there.
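
A sketch of that pattern. The node-type label name is arbitrary, and the EC2NodeClass is the one defined in Step 4:

yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: critical
spec:
  template:
    metadata:
      labels:
        node-type: critical   # arbitrary label, targeted by pod nodeSelectors
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]   # guaranteed capacity only

Critical pods then opt in with nodeSelector: {node-type: critical} in their pod spec.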

Instance categories: c (compute), m (general purpose), r (memory) covers the common instance families. Restricting to generation > 2 avoids older, less cost-effective instance types.

expireAfter: Node expiry forces regular rotation, ensuring nodes pick up the latest AMI (with OS security patches). 30 days is a reasonable default for most environments.

consolidationPolicy: WhenEmptyOrUnderutilized consolidates nodes that are empty or where pods can be rescheduled more efficiently. This is the most aggressive consolidation setting — it reduces cost but increases pod disruption. Use WhenEmpty if your workloads are sensitive to disruption.
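
If you want consolidation but need to bound the blast radius, give disruption-sensitive workloads a PodDisruptionBudget that allows one eviction at a time. A sketch for a hypothetical app label:

yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  maxUnavailable: 1        # consolidation can evict at most one pod at a time
  selector:
    matchLabels:
      app: web             # hypothetical label; match your workload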


Step 6: Verify Karpenter Is Working

Deploy a test workload that requests more capacity than your existing nodes have:

bash
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: karpenter-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: karpenter-test
  template:
    metadata:
      labels:
        app: karpenter-test
    spec:
      containers:
        - name: pause
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
EOF

Watch for node provisioning:

bash
kubectl get nodes -w

Within 30–60 seconds you should see a new node in NotReady, transitioning to Ready. Check Karpenter's logs if it doesn't:

bash
kubectl logs -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter -c controller --tail=50
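
Each node Karpenter launches is backed by a NodeClaim resource, which records the instance type and capacity type it chose:

bash
kubectl get nodeclaims -o wide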

Clean up after verifying:

bash
kubectl delete deployment karpenter-test

Common Failure Modes

Nodes launch but don't join the cluster. Almost always an aws-auth ConfigMap issue — the KarpenterNodeRole isn't mapped. Verify with kubectl describe configmap aws-auth -n kube-system.

"No instance type satisfies requirements." Your NodePool requirements are too restrictive. Check that your subnets have available IPs in the requested AZs, that the instance categories you specified are available in your region, and that your limits haven't been reached.

IAM deadlock on Terraform apply. If you manage the Karpenter IAM role with Terraform and the controller is running when you apply, you can hit a race condition where the controller's in-flight EC2 calls hold a lock on the role. See Karpenter IAM Deadlock: How We Broke Our EKS Cluster with a Terraform Apply for the full breakdown and fix.

Pods not consolidating. Consolidation requires pods to have PodDisruptionBudgets that allow disruption, or no PDB at all. If every pod on a node is covered by a PDB that blocks eviction, the node won't consolidate. Check kubectl get pdb -A.

Spot interruptions not handled. Karpenter handles spot interruptions via an SQS queue that receives EC2 interruption notices. If the queue wasn't created (the CloudFormation stack step was skipped), nodes will be terminated without cordon/drain. Verify the queue exists: aws sqs list-queues | grep ${CLUSTER_NAME}.


Frequently Asked Questions

Should I remove Cluster Autoscaler before installing Karpenter?

Yes. Running both simultaneously causes conflicts: both controllers react to the same unschedulable pods, so you can end up provisioning duplicate capacity. Drain and remove Cluster Autoscaler before deploying Karpenter. Karpenter provisions nodes directly rather than scaling node groups, so the two architectures don't coexist cleanly.
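
A typical removal sequence, assuming Cluster Autoscaler runs as a Deployment named cluster-autoscaler in kube-system (adjust for your install method):

bash
# Scale to zero first so you can roll back quickly if Karpenter misbehaves
kubectl -n kube-system scale deployment cluster-autoscaler --replicas=0

# Delete once Karpenter is verified (Step 6)
kubectl -n kube-system delete deployment cluster-autoscaler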

Can Karpenter manage existing managed node groups?

No. Karpenter provisions nodes directly via EC2, bypassing node groups entirely. Your existing managed node groups are unaffected and still need to be scaled by Cluster Autoscaler or manually — unless you migrate those workloads to Karpenter-managed nodes and scale the node group to zero.
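
Once those workloads have moved, the node group can be scaled down. A sketch, assuming a node group named default-ng (EKS requires a max size of at least 1):

bash
eksctl scale nodegroup \
  --cluster "${CLUSTER_NAME}" \
  --name default-ng \
  --nodes 0 --nodes-min 0 --nodes-max 1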

What's the difference between NodePool and the old Provisioner?

Provisioner was the v0.x API; it was deprecated and then removed late in the v0.x series during the migration to the beta APIs, well before v1.0. NodePool is the stable v1 equivalent. The structure is similar but not identical — the requirements syntax is the same, but ttlSecondsUntilExpired is replaced by expireAfter, and AWSNodeTemplate is replaced by EC2NodeClass. Do not use v0.x YAML with a v1.x installation.

How do I configure Karpenter for GPU nodes?

Add a separate NodePool with requirements targeting GPU instance types:

yaml
requirements:
  - key: karpenter.k8s.aws/instance-category
    operator: In
    values: ["g", "p"]
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["on-demand"]  # GPU spot is scarce, use on-demand for reliability

Use a separate EC2NodeClass that selects a GPU-capable AMI (such as the EKS-optimised NVIDIA variant), and deploy the NVIDIA device plugin so GPUs are exposed to the scheduler as nvidia.com/gpu resources.

Does Karpenter support ARM (Graviton) instances?

Yes. Add arm64 to the kubernetes.io/arch requirement or create a dedicated ARM NodePool. Ensure your container images are multi-arch (or use separate node selectors to route arm64-compatible workloads). See AWS Graviton and ARM64 Migration Guide for the full migration approach.
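
The requirement change itself is small; allowing both architectures lets Karpenter pick Graviton whenever it's cheaper:

yaml
- key: kubernetes.io/arch
  operator: In
  values: ["arm64", "amd64"]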


Setting up Karpenter for a production EKS cluster and hitting edge cases? Talk to us at Coding Protocols — we've done this migration enough times to know where it breaks.

Related Topics

Karpenter
EKS
AWS
Autoscaling
Kubernetes
Platform Engineering
Helm
