How to Install Karpenter on EKS: A Production-Ready Setup Guide
A step-by-step guide to installing Karpenter v1.x on EKS using the current API — NodePool, EC2NodeClass, and Helm. Includes IAM setup, common failure modes, and what to verify before sending production traffic through it.

Karpenter replaces the Cluster Autoscaler for most EKS workloads. It provisions nodes faster, bin-packs more efficiently, and supports spot interruption handling out of the box. If you're still running Cluster Autoscaler on EKS, it's worth the migration.
This guide covers a production-ready Karpenter v1.x installation. If you've seen older tutorials using Provisioner and AWSNodeTemplate, those are the v0.x APIs — both were removed in v1.0. The current API uses NodePool and EC2NodeClass.
Prerequisites
- EKS cluster running Kubernetes 1.27+ (Karpenter v1.x requires 1.27 minimum)
- `eksctl`, `kubectl`, `helm`, and `aws` CLI installed and configured
- Cluster using the VPC CNI (the default for EKS)
- Your AWS account ID and cluster name ready — you'll use these throughout
Set variables up front:
```bash
export CLUSTER_NAME="my-cluster"
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export AWS_REGION="us-east-1"
export KARPENTER_VERSION="1.3.3" # check https://github.com/aws/karpenter/releases for latest
export KARPENTER_NAMESPACE="kube-system"
```

Step 1: Create the Karpenter IAM Role
Karpenter's controller needs IAM permissions to call EC2 APIs: describe instance types, launch instances, terminate instances, and tag resources. It also needs to pass an IAM role to the instances it launches.
Create the IAM role first. As of 2026, EKS Pod Identity is the preferred method for granting IAM permissions to pods: it's simpler than IRSA because it doesn't require OIDC provider management or service account annotations.
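As a sketch (the role and policy names follow the conventions used later in this guide, not output from any tool), the trust policy for a Pod Identity role looks like this:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "pods.eks.amazonaws.com" },
      "Action": ["sts:AssumeRole", "sts:TagSession"]
    }
  ]
}
```

Pass this as `--assume-role-policy-document` to `aws iam create-role` when creating `KarpenterControllerRole-${CLUSTER_NAME}`, then attach the `KarpenterControllerPolicy` that the CloudFormation stack (below) creates. Note that Pod Identity also requires `sts:TagSession`, not just `sts:AssumeRole`.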
Option A: EKS Pod Identity (Recommended)
- Ensure the EKS Pod Identity Agent add-on (`eks-pod-identity-agent`) is installed on the cluster — Pod Identity doesn't work without it.
- Create the IAM role with a trust policy allowing `pods.eks.amazonaws.com` to assume it.
- Associate the role with the Karpenter service account:

```bash
aws eks create-pod-identity-association \
  --cluster-name "${CLUSTER_NAME}" \
  --namespace "${KARPENTER_NAMESPACE}" \
  --service-account karpenter \
  --role-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterControllerRole-${CLUSTER_NAME}"
```

Option B: IRSA (Legacy)
If your cluster is running an older version or you prefer OIDC:
```bash
eksctl create iamserviceaccount \
  --cluster "${CLUSTER_NAME}" \
  --namespace "${KARPENTER_NAMESPACE}" \
  --name karpenter \
  --role-name "KarpenterControllerRole-${CLUSTER_NAME}" \
  --attach-policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}" \
  --approve
```

But first you need the policy. The official Karpenter CloudFormation stack handles this. Pull it:
```bash
# aws cloudformation deploy requires a local template file, so download it first
curl -fsSL "https://raw.githubusercontent.com/aws/karpenter-provider-aws/v${KARPENTER_VERSION}/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml" \
  -o karpenter-cfn.yaml

aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file karpenter-cfn.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"
```

This creates:
- `KarpenterControllerPolicy-<cluster>` — EC2 permissions for the controller
- `KarpenterNodeRole-<cluster>` — the IAM role attached to nodes Karpenter launches
- `KarpenterNodeInstanceProfile-<cluster>` — the instance profile wrapping the node role
Important: the KarpenterNodeRole needs to be mapped in your aws-auth ConfigMap so that nodes Karpenter launches can join the cluster:
```bash
eksctl create iamidentitymapping \
  --cluster "${CLUSTER_NAME}" \
  --region "${AWS_REGION}" \
  --arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}" \
  --username "system:node:{{EC2PrivateDNSName}}" \
  --group system:bootstrappers,system:nodes
```

Step 2: Tag Your Subnets and Security Groups
Karpenter discovers which subnets and security groups to use for new nodes via EC2 resource tags. Without these tags, Karpenter can't launch nodes.
Tag your node subnets:
```bash
for SUBNET_ID in $(aws eks describe-cluster \
  --name "${CLUSTER_NAME}" \
  --query "cluster.resourcesVpcConfig.subnetIds[]" \
  --output text); do
  aws ec2 create-tags \
    --resources "${SUBNET_ID}" \
    --tags "Key=karpenter.sh/discovery,Value=${CLUSTER_NAME}"
done
```

Tag the cluster security group:
```bash
CLUSTER_SG=$(aws eks describe-cluster \
  --name "${CLUSTER_NAME}" \
  --query "cluster.resourcesVpcConfig.clusterSecurityGroupId" \
  --output text)

aws ec2 create-tags \
  --resources "${CLUSTER_SG}" \
  --tags "Key=karpenter.sh/discovery,Value=${CLUSTER_NAME}"
```

These tags are what the EC2NodeClass resource references when selecting subnets and security groups.
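Before moving on, it's worth confirming the discovery tags took effect with a quick read-only check (not part of the original setup, but cheap insurance):

```bash
aws ec2 describe-subnets \
  --filters "Name=tag:karpenter.sh/discovery,Values=${CLUSTER_NAME}" \
  --query "Subnets[].SubnetId" \
  --output text
```

If this returns nothing, Karpenter won't be able to discover subnets later and node launches will fail.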
Step 3: Install Karpenter with Helm
Karpenter's chart is published as an OCI artifact, so there's no Helm repo to add — install it directly from public ECR:
```bash
helm registry logout public.ecr.aws 2>/dev/null || true

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "${KARPENTER_VERSION}" \
  --namespace "${KARPENTER_NAMESPACE}" \
  --create-namespace \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait
```

Key flags:
- `settings.clusterName` — tells Karpenter which cluster it's managing
- `settings.interruptionQueue` — the SQS queue name for spot interruption events (created by the CloudFormation stack)
- Resource requests/limits — Karpenter runs as a deployment; right-size it to avoid OOM on large clusters
Verify the controller is running:
```bash
kubectl get pods -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter
```

Expected output: two controller pods in Running state — the chart runs two replicas by default (v1.x has no separate webhook pod).
Step 4: Create an EC2NodeClass
EC2NodeClass defines the AWS-specific configuration for the nodes Karpenter provisions: AMI family, subnets, security groups, IAM instance profile, and block device mappings.
```yaml
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2023
  role: "KarpenterNodeRole-${CLUSTER_NAME}"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  amiSelectorTerms:
    - alias: al2023@latest
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 50Gi
        volumeType: gp3
        encrypted: true
```

Apply it:

```bash
envsubst < ec2nodeclass.yaml | kubectl apply -f -
```

Notes:
- `AL2023` is Amazon Linux 2023 — the recommended AMI family for new EKS clusters. `AL2` (Amazon Linux 2) is still supported but entering maintenance.
- `amiSelectorTerms` with `alias: al2023@latest` automatically tracks the latest EKS-optimised AL2023 AMI for your cluster version. This means node refreshes pick up security patches automatically.
- Always set `encrypted: true` on block devices. The default is unencrypted.
Step 5: Create a NodePool
NodePool defines the scheduling constraints for nodes Karpenter provisions: instance types, zones, capacity type (on-demand vs spot), taints, labels, and expiry.
```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        billing-team: platform
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      expireAfter: 720h # 30 days — force node rotation for security patches
  limits:
    cpu: 1000
    memory: 1000Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
```

Apply it:

```bash
kubectl apply -f nodepool.yaml
```

Key decisions in this config:
Capacity type: `["spot", "on-demand"]` allows Karpenter to use spot when available and fall back to on-demand. Spot can be 70–90% cheaper. If you need guaranteed capacity for a workload, add a separate NodePool with `values: ["on-demand"]` and use node selectors to route critical pods there.
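A minimal sketch of that second pool — the pool name `critical` and the `workload-tier` taint key are illustrative, not from any standard:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: critical
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]   # guaranteed capacity only
      taints:
        - key: workload-tier
          value: critical
          effect: NoSchedule       # keep ordinary pods off these nodes
```

Critical pods then carry a matching toleration plus a nodeSelector against a label you add to this pool's template, so only they land on the on-demand nodes.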
Instance categories: c (compute), m (general purpose), r (memory) covers the common instance families. Restricting to generation > 2 avoids older, less cost-effective instance types.
expireAfter: Node expiry forces regular rotation, ensuring nodes pick up the latest AMI (with OS security patches). 30 days is a reasonable default for most environments.
consolidationPolicy: WhenEmptyOrUnderutilized consolidates nodes that are empty or where pods can be rescheduled more efficiently. This is the most aggressive consolidation setting — it reduces cost but increases pod disruption. Use WhenEmpty if your workloads are sensitive to disruption.
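If you want the cost savings of aggressive consolidation but bounded churn, NodePools also support disruption budgets. A sketch:

```yaml
disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized
  consolidateAfter: 1m
  budgets:
    - nodes: "10%" # disrupt at most 10% of this pool's nodes at a time
```

This caps how many nodes Karpenter will voluntarily disrupt simultaneously, smoothing pod churn during consolidation.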
Step 6: Verify Karpenter Is Working
Deploy a test workload that requests more capacity than your existing nodes have:
```bash
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: karpenter-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: karpenter-test
  template:
    metadata:
      labels:
        app: karpenter-test
    spec:
      containers:
        - name: pause
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
EOF
```

Watch for node provisioning:
```bash
kubectl get nodes -w
```

Within 30–60 seconds you should see a new node in NotReady, transitioning to Ready. Check Karpenter's logs if it doesn't:
```bash
kubectl logs -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter -c controller --tail=50
```

Clean up after verifying:
```bash
kubectl delete deployment karpenter-test
```

Common Failure Modes
Nodes launch but don't join the cluster. Almost always an aws-auth ConfigMap issue — the KarpenterNodeRole isn't mapped. Verify with kubectl describe configmap aws-auth -n kube-system.
"No instance type satisfies requirements." Your NodePool requirements are too restrictive. Check that your subnets have available IPs in the requested AZs, that the instance categories you specified are available in your region, and that your limits haven't been reached.
IAM deadlock on Terraform apply. If you manage the Karpenter IAM role with Terraform and the controller is running when you apply, you can hit a race condition where the controller's in-flight EC2 calls hold a lock on the role. See Karpenter IAM Deadlock: How We Broke Our EKS Cluster with a Terraform Apply for the full breakdown and fix.
Pods not consolidating. Consolidation requires pods to have PodDisruptionBudgets that allow disruption, or no PDB at all. If all pods in a node have PDBs preventing eviction, the node won't consolidate. Check kubectl get pdb -A.
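A PDB that permits gradual eviction — so consolidation can still proceed — looks like this (the name and selector are illustrative):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  maxUnavailable: 1   # allow one pod at a time to be evicted
  selector:
    matchLabels:
      app: web
```

A PDB with `maxUnavailable: 0` (or `minAvailable` equal to the replica count) blocks eviction entirely and will pin the node.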
Spot interruptions not handled. Karpenter handles spot interruptions via an SQS queue that receives EC2 interruption notices. If the queue wasn't created (the CloudFormation stack step was skipped), nodes will be terminated without cordon/drain. Verify the queue exists: aws sqs list-queues | grep ${CLUSTER_NAME}.
Frequently Asked Questions
Should I remove Cluster Autoscaler before installing Karpenter?
Yes. Running both simultaneously causes conflicts — they both attempt to scale the same node groups. Drain and remove Cluster Autoscaler before deploying Karpenter. Karpenter manages unmanaged nodes (nodes it provisions directly), not node group scaling, so the two architectures don't coexist cleanly.
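Assuming the standard Deployment name and namespace (yours may differ), removal is typically:

```bash
# Scale to zero first so you can roll back quickly if needed
kubectl -n kube-system scale deployment/cluster-autoscaler --replicas=0

# Once Karpenter is verified, remove it entirely
kubectl -n kube-system delete deployment/cluster-autoscaler
```

Scaling to zero before deleting gives you a fast rollback path while you validate Karpenter's behaviour.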
Can Karpenter manage existing managed node groups?
No. Karpenter provisions nodes directly via EC2, bypassing node groups entirely. Your existing managed node groups are unaffected and still need to be scaled by Cluster Autoscaler or manually — unless you migrate those workloads to Karpenter-managed nodes and scale the node group to zero.
What's the difference between NodePool and the old Provisioner?
Provisioner was the v0.x API, deprecated in v0.33 and removed in v1.0. NodePool is the stable v1 equivalent. The structure is similar but not identical — the requirements syntax is the same, but ttlSecondsUntilExpired is replaced by expireAfter, and AWSNodeTemplate is replaced by EC2NodeClass. Do not use v0.x YAML with a v1.x installation.
How do I configure Karpenter for GPU nodes?
Add a separate NodePool with requirements targeting GPU instance types:
```yaml
requirements:
  - key: karpenter.k8s.aws/instance-category
    operator: In
    values: ["g", "p"]
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["on-demand"] # GPU spot is scarce, use on-demand for reliability
```

Use a separate EC2NodeClass with an AL2023 GPU AMI alias and the appropriate NVIDIA driver bootstrap configuration in userData.
Does Karpenter support ARM (Graviton) instances?
Yes. Add arm64 to the kubernetes.io/arch requirement or create a dedicated ARM NodePool. Ensure your container images are multi-arch (or use separate node selectors to route arm64-compatible workloads). See AWS Graviton and ARM64 Migration Guide for the full migration approach.
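For example, widening the arch requirement from the Step 5 NodePool to cover both architectures:

```yaml
- key: kubernetes.io/arch
  operator: In
  values: ["amd64", "arm64"]
```

With both values present, Karpenter is free to pick Graviton instances when they're the cheapest fit — so make sure every image scheduled to that pool is multi-arch first.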
Setting up Karpenter for a production EKS cluster and hitting edge cases? Talk to us at Coding Protocols — we've done this migration enough times to know where it breaks.


