Install Karpenter on EKS for Automatic Node Provisioning

Advanced · 45 min to complete · 18 min read

Set up the required IAM roles, install Karpenter on EKS, configure an EC2NodeClass and NodePool, and watch Karpenter provision a new node within 60–90 seconds of an unschedulable pod appearing.

Before you begin

  • An existing EKS cluster
  • kubectl configured with cluster-admin access
  • Helm 3 installed
  • AWS CLI configured with admin credentials
  • eksctl installed (optional but useful)
  • Your cluster's OIDC provider ID

The Cluster Autoscaler works, but it has a fundamental limitation: it scales node groups, not individual nodes, so you're constrained to the instance types you pre-configured. Karpenter replaces this with a smarter model: it reads the actual resource requirements of pending pods, provisions exactly the right instance type for them, chooses Spot or On-Demand based on your policy, and consolidates underutilised nodes automatically.

The result: faster scale-out (60–90 seconds vs 4–5 minutes with Cluster Autoscaler), better bin-packing, and lower bills.

Step 1: Set environment variables

Export these values — they're used throughout the tutorial:

bash
export CLUSTER_NAME=my-cluster
export AWS_DEFAULT_REGION=us-east-1
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export KARPENTER_VERSION=1.12.1
export KARPENTER_NAMESPACE=kube-system

# Get the cluster's OIDC provider ID
export OIDC_ID=$(aws eks describe-cluster \
  --name $CLUSTER_NAME \
  --query "cluster.identity.oidc.issuer" \
  --output text | sed 's|.*/||')

echo "Account: $AWS_ACCOUNT_ID, OIDC: $OIDC_ID"

Step 2: Create the KarpenterNode IAM role

Karpenter-provisioned nodes need an IAM role with permissions to join the cluster and pull from ECR:

bash
cat <<EOF > node-trust.json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "ec2.amazonaws.com"},
    "Action": "sts:AssumeRole"
  }]
}
EOF

aws iam create-role \
  --role-name "KarpenterNodeRole-${CLUSTER_NAME}" \
  --assume-role-policy-document file://node-trust.json

for policy in AmazonEKSWorkerNodePolicy AmazonEKS_CNI_Policy \
              AmazonEC2ContainerRegistryReadOnly AmazonSSMManagedInstanceCore; do
  aws iam attach-role-policy \
    --role-name "KarpenterNodeRole-${CLUSTER_NAME}" \
    --policy-arn "arn:aws:iam::aws:policy/${policy}"
done
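
To confirm the role is ready, list its attached policies; you should see all four:

bash
aws iam list-attached-role-policies \
  --role-name "KarpenterNodeRole-${CLUSTER_NAME}" \
  --query "AttachedPolicies[].PolicyName"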

Step 3: Create the KarpenterController IAM role

The Karpenter controller pod needs permissions to manage EC2 instances, query pricing, and receive Spot interruption events:

bash
cat <<EOF > controller-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Karpenter",
      "Effect": "Allow",
      "Action": [
        "ec2:CreateFleet", "ec2:CreateLaunchTemplate", "ec2:CreateTags",
        "ec2:DeleteLaunchTemplate",
        "ec2:DescribeImages", "ec2:DescribeInstances", "ec2:DescribeInstanceStatus",
        "ec2:DescribeInstanceTypeOfferings", "ec2:DescribeInstanceTypes",
        "ec2:DescribeLaunchTemplates", "ec2:DescribeSecurityGroups",
        "ec2:DescribeSpotPriceHistory", "ec2:DescribeSubnets",
        "ec2:DescribeCapacityReservations", "ec2:DescribePlacementGroups",
        "ec2:RunInstances", "ec2:TerminateInstances",
        "iam:PassRole", "iam:GetInstanceProfile",
        "iam:CreateInstanceProfile", "iam:TagInstanceProfile",
        "iam:AddRoleToInstanceProfile", "iam:RemoveRoleFromInstanceProfile",
        "iam:DeleteInstanceProfile", "iam:ListInstanceProfiles",
        "pricing:GetProducts",
        "sqs:DeleteMessage", "sqs:GetQueueUrl", "sqs:ReceiveMessage",
        "ssm:GetParameter",
        "eks:DescribeCluster"
      ],
      "Resource": "*"
    }
  ]
}
EOF

aws iam create-policy \
  --policy-name "KarpenterControllerPolicy-${CLUSTER_NAME}" \
  --policy-document file://controller-policy.json

cat <<EOF > controller-trust.json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/oidc.eks.${AWS_DEFAULT_REGION}.amazonaws.com/id/${OIDC_ID}"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "oidc.eks.${AWS_DEFAULT_REGION}.amazonaws.com/id/${OIDC_ID}:sub": "system:serviceaccount:${KARPENTER_NAMESPACE}:karpenter"
      }
    }
  }]
}
EOF

aws iam create-role \
  --role-name "KarpenterControllerRole-${CLUSTER_NAME}" \
  --assume-role-policy-document file://controller-trust.json

aws iam attach-role-policy \
  --role-name "KarpenterControllerRole-${CLUSTER_NAME}" \
  --policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}"

Step 4: Tag subnets and security groups for discovery

Karpenter discovers which subnets and security groups to use via tags. Tag your private subnets and the cluster node security group:

bash
# List your cluster subnets and security group
aws eks describe-cluster --name $CLUSTER_NAME \
  --query "cluster.resourcesVpcConfig"

# Tag each private subnet (repeat for each subnet ID)
aws ec2 create-tags \
  --resources subnet-XXXXXXXXXXXXXXXXX \
  --tags Key=karpenter.sh/discovery,Value=$CLUSTER_NAME

# Tag the cluster security group
aws ec2 create-tags \
  --resources sg-XXXXXXXXXXXXXXXXX \
  --tags Key=karpenter.sh/discovery,Value=$CLUSTER_NAME
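
If you'd rather not paste subnet IDs one by one, a loop like the sketch below tags every subnet attached to the cluster. Note that this includes public subnets, so skip it if you only want Karpenter launching into private ones:

bash
# Tag all subnets the cluster knows about (public and private)
for subnet in $(aws eks describe-cluster --name $CLUSTER_NAME \
    --query "cluster.resourcesVpcConfig.subnetIds[]" --output text); do
  aws ec2 create-tags \
    --resources "$subnet" \
    --tags Key=karpenter.sh/discovery,Value=$CLUSTER_NAME
done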

Step 5: Grant the Karpenter node role access to the cluster

Add the Karpenter node role to aws-auth so that nodes provisioned by Karpenter can join the cluster. Use eksctl to append the mapping safely — kubectl patch --type merge replaces the entire mapRoles key and will silently delete all existing node group mappings:

bash
eksctl create iamidentitymapping \
  --cluster "${CLUSTER_NAME}" \
  --region "${AWS_DEFAULT_REGION}" \
  --arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}" \
  --username "system:node:{{EC2PrivateDNSName}}" \
  --group "system:bootstrappers" \
  --group "system:nodes"

If you don't have eksctl, edit the ConfigMap manually with kubectl edit configmap aws-auth -n kube-system and append the entry under mapRoles.
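
For reference, the entry to append under mapRoles looks like this (the same mapping the eksctl command creates; substitute your account ID and cluster name):

yaml
- rolearn: arn:aws:iam::<AWS_ACCOUNT_ID>:role/KarpenterNodeRole-<CLUSTER_NAME>
  username: system:node:{{EC2PrivateDNSName}}
  groups:
    - system:bootstrappers
    - system:nodes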

Step 6: Install Karpenter with Helm

Since v0.17, Karpenter is distributed via an OCI registry — there is no helm repo add step:

bash
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "${KARPENTER_VERSION}" \
  --namespace "${KARPENTER_NAMESPACE}" \
  --create-namespace \
  --set settings.clusterName="${CLUSTER_NAME}" \
  --set settings.interruptionQueue="${CLUSTER_NAME}" \
  --set "serviceAccount.annotations.eks\.amazonaws\.com/role-arn=arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterControllerRole-${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait

The serviceAccount.annotations line enables IRSA (IAM Roles for Service Accounts), which is what the OIDC trust policy from Step 3 sets up. If your cluster uses EKS Pod Identity instead, create a Pod Identity Association for the Karpenter service account and omit this annotation, as sketched below.
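
A minimal sketch of the Pod Identity route follows. It assumes the Amazon EKS Pod Identity Agent add-on is installed, and the controller role's trust policy must then trust pods.eks.amazonaws.com rather than the OIDC provider from Step 3:

bash
# Associate the controller role with the karpenter service account
aws eks create-pod-identity-association \
  --cluster-name "${CLUSTER_NAME}" \
  --namespace "${KARPENTER_NAMESPACE}" \
  --service-account karpenter \
  --role-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterControllerRole-${CLUSTER_NAME}"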

settings.interruptionQueue requires a real SQS queue named after your cluster. If you have not created it yet, either omit this flag (Spot interruption handling will be disabled) or create the queue and EventBridge rules now — see the Karpenter getting started guide for the CloudFormation template that creates both.
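
If you want a minimal CLI alternative to the CloudFormation template, the sketch below creates the queue and a single EventBridge rule for Spot interruption warnings (the rule name is illustrative). The full setup also adds rules for rebalance recommendations, instance state changes, and scheduled-change health events:

bash
# Queue name must match settings.interruptionQueue
aws sqs create-queue --queue-name "${CLUSTER_NAME}" \
  --attributes '{"MessageRetentionPeriod":"300"}'

# Route Spot interruption warnings to the queue
aws events put-rule \
  --name "Karpenter-SpotInterruption-${CLUSTER_NAME}" \
  --event-pattern '{"source":["aws.ec2"],"detail-type":["EC2 Spot Instance Interruption Warning"]}'

aws events put-targets \
  --rule "Karpenter-SpotInterruption-${CLUSTER_NAME}" \
  --targets "Id"="1","Arn"="arn:aws:sqs:${AWS_DEFAULT_REGION}:${AWS_ACCOUNT_ID}:${CLUSTER_NAME}"

# Note: the queue also needs a resource policy allowing
# events.amazonaws.com to call sqs:SendMessage on it.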

Verify the controller is running:

bash
kubectl get pods -n $KARPENTER_NAMESPACE -l app.kubernetes.io/name=karpenter

Step 7: Create an EC2NodeClass

The EC2NodeClass tells Karpenter which AMI, IAM role, subnets, and security groups to use when launching nodes:

bash
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
  - alias: al2023@latest
  role: KarpenterNodeRole-${CLUSTER_NAME}
  subnetSelectorTerms:
  - tags:
      karpenter.sh/discovery: ${CLUSTER_NAME}
  securityGroupSelectorTerms:
  - tags:
      karpenter.sh/discovery: ${CLUSTER_NAME}
  tags:
    karpenter.sh/discovery: ${CLUSTER_NAME}
EOF
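
Before creating the NodePool, check that the node class resolved an AMI, subnets, and security groups; if it's not Ready, the discovery tags from Step 4 likely didn't match anything:

bash
kubectl get ec2nodeclass default
# Inspect the Status conditions if it's not Ready
kubectl describe ec2nodeclass default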

Step 8: Create a NodePool

A NodePool defines instance type preferences, capacity type (Spot vs On-Demand), and consolidation policy:

bash
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot", "on-demand"]
      - key: kubernetes.io/arch
        operator: In
        values: ["amd64"]
      - key: karpenter.k8s.aws/instance-category
        operator: In
        values: ["c", "m", "r"]
      - key: karpenter.k8s.aws/instance-generation
        operator: Gt
        values: ["2"]
  limits:
    cpu: 1000
    memory: 4000Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
EOF

By default, Karpenter does not consolidate Spot-to-Spot node replacements. To enable it (saving cost by downsizing Spot nodes), add --set settings.featureGates.spotToSpotConsolidation=true to the Helm install command.

Step 9: Test node provisioning

Deploy a workload that exceeds your current cluster capacity and watch Karpenter provision a node:

bash
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: karpenter-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: karpenter-test
  template:
    metadata:
      labels:
        app: karpenter-test
    spec:
      containers:
      - name: app
        image: public.ecr.aws/amazonlinux/amazonlinux:2023
        command: ["sleep", "3600"]
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
EOF

Watch new nodes join the cluster — typically 60–90 seconds on first launch while the EC2 instance boots and runs the bootstrap script:

bash
kubectl get nodes --watch

Check Karpenter's logs to see its decision-making:

bash
kubectl logs -l app.kubernetes.io/name=karpenter \
  -n $KARPENTER_NAMESPACE --follow

Step 10: Clean up the test

bash
kubectl delete deployment karpenter-test

Karpenter will consolidate and terminate the nodes it provisioned within ~1 minute (per the consolidateAfter: 1m setting).
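
You can follow the scale-down through Karpenter's NodeClaim resources, which track each node Karpenter owns from launch to termination:

bash
kubectl get nodeclaims --watch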

What you built

Karpenter is now watching your cluster for unschedulable pods. When one appears, it selects the optimal instance type for the pending workload, launches an instance directly via the EC2 Fleet API, and has it ready within 60–90 seconds on a typical first launch. When pods are removed, Karpenter consolidates underutilised nodes and terminates them, cutting your EC2 bill proportionally. The Spot preference in the NodePool means most nodes will run at a 60–80% discount when Spot capacity is available.

We built Podscape to simplify Kubernetes workflows like this — logs, events, and cluster state in one interface, without switching tools.

Struggling with this in production?

We help teams fix these exact issues. Our engineers have deployed these patterns across production environments at scale.