Actions Runner Controller (ARC): Self-Hosted GitHub Actions on Kubernetes

GitHub-hosted runners are the right default — managed infrastructure, no maintenance, pay-per-minute. But there are cases where they stop being the right choice:

Private network access: your build needs to reach an internal Artifactory, a dev database, or a private Kubernetes API. VPN tunnels from hosted runners are painful.
Larger compute: the largest hosted runner is 64-core/256GB. If you're building large monorepos, running integration tests with real containers, or doing ML model builds, you want your own instance sizes.
GPU jobs: GitHub-hosted GPU runners exist but are Enterprise-only and limited to specific regions. If you need NVIDIA A10G or custom GPU instance types, or you're not on Enterprise, self-hosted is your only option.
Cost at scale: GitHub-hosted runners are priced per-minute. At high build volume, the per-minute cost of running your own compute (especially spot instances) is materially lower.

Actions Runner Controller (ARC) v2 is the official GitHub-maintained solution. It runs ephemeral runner pods on Kubernetes — one pod per job, terminated when the job completes, scaled to zero when the queue is empty.

ARC v2 Architecture

ARC v2 was a complete rewrite from the community-maintained v1 (summerwind/actions-runner-controller). The key architectural difference: v2 uses GitHub's runner scale set API and just-in-time (JIT) runner tokens instead of a webhook server.

Two components:

Controller manager (cluster-wide, one per cluster):

Watches AutoscalingRunnerSet resources
Manages the lifecycle of AutoscalingListener pods
Does not need inbound webhooks — it's purely outbound to GitHub's API

Runner scale set (one per team/namespace/use-case):

An AutoscalingListener pod maintains a long-poll connection to GitHub
When a job is queued for this scale set, ARC creates an EphemeralRunner pod
The pod registers with GitHub using a single-use JIT token, runs the job, then terminates

Each runner is truly ephemeral: no shared state between jobs, no credential leakage, no "dirty runner" problems from previous job artifacts.

Step 1: GitHub App Setup

GitHub App authentication is recommended over a personal access token (PAT): the app's permissions are scoped, its activity is auditable, and its private key doesn't expire.

In your GitHub organization: Settings → Developer settings → GitHub Apps → New GitHub App

Set:

Homepage URL: any valid URL (your org's GitHub URL)
Webhook: uncheck "Active" — ARC doesn't use webhooks
Permissions (for organization-level runners):
- Organization permissions: Self-hosted runners → Read & Write
Permissions (for repository-level runners instead):
- Repository permissions: Administration → Read & Write
Where can this GitHub App be installed: Only on this account

After creating the app:

Note the App ID (visible on the app settings page)
Generate a private key (scroll to bottom → Generate a private key)
Install the app on your organization (App settings → Install App → your org)
Note the Installation ID from the URL after installation: https://github.com/organizations/your-org/settings/installations/789012

Create the Kubernetes secret:

bash

kubectl create namespace arc-systems
kubectl create namespace arc-runners

kubectl create secret generic arc-github-secret \
  --namespace arc-runners \
  --from-literal=github_app_id="123456" \
  --from-literal=github_app_installation_id="789012" \
  --from-file=github_app_private_key=./private-key.pem
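If you manage cluster state declaratively, the same secret can be written as a manifest. This is a minimal sketch that reuses the key names from the command above; the private key body is a placeholder, and a manifest holding a real key should be encrypted with whatever secret tooling you already use before it reaches Git.

```yaml
# Declarative equivalent of the kubectl create secret command above.
apiVersion: v1
kind: Secret
metadata:
  name: arc-github-secret
  namespace: arc-runners
type: Opaque
stringData:
  github_app_id: "123456"
  github_app_installation_id: "789012"
  # Placeholder: paste the full contents of private-key.pem here.
  github_app_private_key: |
    <contents of private-key.pem>
```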

Step 2: Install the Controller

ARC v2 is distributed as OCI Helm charts on GHCR. Check the latest chart version first:

bash

helm show chart \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller \
  | grep ^version

Then install:

bash

helm install arc \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller \
  --namespace arc-systems \
  --create-namespace \
  --set replicaCount=2

replicaCount: 2 gives you a leader-elected HA controller. For a single-node dev cluster, 1 is fine.

Step 3: Install a Runner Scale Set

yaml

# arc-runner-set-values.yaml
githubConfigUrl: "https://github.com/your-org"
githubConfigSecret: arc-github-secret

minRunners: 0       # scale to zero when no jobs queued
maxRunners: 20      # cap on concurrent runner pods

runnerScaleSetName: "arc-runner-set"   # used in workflows: runs-on: arc-runner-set

containerMode:
  type: "kubernetes"   # no Docker daemon required (see below)
  kubernetesModeWorkVolumeClaim:
    accessModes: ["ReadWriteOnce"]
    storageClassName: "gp3"
    resources:
      requests:
        storage: 5Gi

template:
  spec:
    securityContext:
      fsGroup: 1001
    containers:
      - name: runner
        image: ghcr.io/actions/actions-runner:latest
        command: ["/home/runner/run.sh"]   # starts the runner process when you override the template
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 4
            memory: 8Gi

bash

helm install arc-runner-set \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set \
  --namespace arc-runners \
  --create-namespace \
  --values arc-runner-set-values.yaml

githubConfigUrl scopes the runners:

https://github.com/your-org — organization-level (any repo in the org can use them)
https://github.com/your-org/your-repo — repository-level
https://github.com/enterprises/your-enterprise — enterprise-level

Container Mode: Kubernetes vs Docker-in-Docker

This is the most consequential configuration choice.

Kubernetes Mode (Recommended)

In Kubernetes mode, the runner pod doesn't run a Docker daemon. When a workflow step uses a container image (via container: or service containers), ARC uses the Kubernetes API to create additional pods in the same namespace. The work directory is a shared PVC mounted into both the runner pod and the work pods.

There are two Kubernetes mode variants:

kubernetes — uses a PVC for the shared work volume. Requires a StorageClass that supports dynamic provisioning.
kubernetes-novolume — uses lifecycle hooks instead of a persistent volume. Better for clusters where PVC provisioning is slow or unavailable (e.g., clusters without a dynamic provisioner).

yaml

# kubernetes-novolume mode — no PVC required
containerMode:
  type: "kubernetes-novolume"

Pros:

No privileged containers
Each step container is a native K8s pod — you get full node scheduling (GPU node selectors, resource quotas, spot tolerations)
Namespace-scoped RBAC is sufficient

Cons:

docker build inside workflows doesn't work natively — there is no Docker daemon in the runner pod. Use Kaniko or Buildah as a build step instead.

For CI pipelines that build Docker images in Kubernetes mode, Kaniko is the standard approach: run it as a workflow step, point it at your Dockerfile, and push directly to ECR using the runner's IRSA credentials. Kaniko requires no Docker socket — it builds OCI images entirely in userspace. A registry-based layer cache (--cache=true --cache-repo=your-ecr-repo/cache) gives good reuse across builds without a local Docker cache.

docker/build-push-action requires a Docker daemon and does not work in Kubernetes mode. If you need docker build semantics directly in your workflow YAML, use DinD mode instead (see below).

Docker-in-Docker Mode

DinD runs a privileged Docker sidecar inside the runner pod. docker build and docker run work as expected.

yaml

containerMode:
  type: "dind"

Privileged containers are a security concern on multi-tenant clusters — a privileged escape in a build job could affect the node. Reserve DinD for dedicated build node pools with taints that prevent other workloads from sharing the nodes:

yaml

template:
  spec:
    tolerations:
      - key: "build-nodes"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
    nodeSelector:
      role: build
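With a DinD scale set in place, a workflow can call the Docker CLI directly. The sketch below assumes the scale set named arc-runner-set was installed with containerMode.type set to dind and that the repository has a Dockerfile at its root; the image name my-app is hypothetical.

```yaml
jobs:
  image:
    runs-on: arc-runner-set        # assumed to be a DinD scale set
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        # Talks to the Docker sidecar running in the same runner pod
        run: docker build -t my-app:${{ github.sha }} .
      - name: List built images
        run: docker image ls my-app
```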

Using Runners in Workflows

yaml

jobs:
  build:
    runs-on: arc-runner-set   # matches runnerScaleSetName in Helm values
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: make build

The runs-on value must exactly match runnerScaleSetName. The runner pod is created when the job is queued, registers with GitHub using a JIT token, runs the job, and terminates. The next job gets a fresh pod.
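In Kubernetes mode, jobs usually declare a job container, which ARC then runs as its own pod next to the runner. A minimal sketch, assuming a Node.js project and the node:20 image (both chosen only for illustration):

```yaml
jobs:
  test:
    runs-on: arc-runner-set      # must match runnerScaleSetName
    container:
      image: node:20             # hypothetical job image; scheduled as a separate pod
    steps:
      - uses: actions/checkout@v4
      - name: Install and test
        run: |
          npm ci
          npm test
```

As far as I know, Kubernetes mode rejects jobs that have no container: block by default. Setting the ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER environment variable to "false" on the runner container relaxes this, but verify the behavior against your chart version.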

Custom Runner Images

The default ghcr.io/actions/actions-runner image is minimal. For platform engineering workflows that need kubectl, helm, aws, terraform, or other tooling pre-installed:

dockerfile

FROM ghcr.io/actions/actions-runner:latest

USER root

RUN apt-get update && apt-get install -y \
    curl \
    unzip \
    && rm -rf /var/lib/apt/lists/*

# kubectl
RUN curl -LO "https://dl.k8s.io/release/$(curl -sL https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl" \
    && install -m 0755 kubectl /usr/local/bin/kubectl \
    && rm kubectl

# helm
RUN curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

# AWS CLI v2
RUN curl -fsSL "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o /tmp/awscliv2.zip \
    && unzip /tmp/awscliv2.zip -d /tmp \
    && /tmp/aws/install \
    && rm -rf /tmp/aws /tmp/awscliv2.zip

USER runner

Reference the custom image in the runner scale set values:

yaml

template:
  spec:
    containers:
      - name: runner
        image: 123456789.dkr.ecr.us-east-1.amazonaws.com/arc-runner:latest

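Pin image tags in production. With latest, Kubernetes defaults imagePullPolicy to Always, so every new runner pod checks the registry and may start on a different image version than the previous job did.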
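A pinned reference might look like the following sketch. The tag is hypothetical; any immutable tag or an image digest serves the same purpose.

```yaml
template:
  spec:
    containers:
      - name: runner
        # Hypothetical immutable tag built by your own image pipeline
        image: 123456789.dkr.ecr.us-east-1.amazonaws.com/arc-runner:2024-06-01-a1b2c3d
        imagePullPolicy: IfNotPresent       # reuse the copy already cached on the node
        command: ["/home/runner/run.sh"]    # start the runner process
```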

AWS Integration: IRSA for EKS Runners

If your CI workflows need AWS access (push to ECR, deploy to EKS, read from S3), attach an IAM role to the runner pod's service account via IRSA rather than storing AWS credentials as secrets.

Create a service account for the runner pods:

yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: arc-runner
  namespace: arc-runners
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789:role/github-runner-role

Reference it in the runner scale set values:

yaml

template:
  spec:
    serviceAccountName: arc-runner
    containers:
      - name: runner
        image: ghcr.io/actions/actions-runner:latest

The runner pod automatically receives short-lived AWS credentials via the projected service account token, which the AWS CLI and SDKs pick up without further configuration. If your workflow uses aws-actions/configure-aws-credentials@v4, set role-to-assume so it resolves credentials for the same role:

yaml

- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789:role/github-runner-role
    aws-region: us-east-1

This eliminates the need for AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY secrets entirely for self-hosted runner workflows — the IRSA token is issued by the EKS OIDC provider, not GitHub's. See GitHub Actions CI/CD for EKS for the full OIDC setup if you're using a hybrid of hosted and self-hosted runners.
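A quick way to confirm the wiring is a step that prints the identity the pod resolves to. This sketch assumes the AWS CLI is installed in the runner image (as in the custom image above) and that the step runs in the runner container itself rather than in a separate job container.

```yaml
- name: Show the IRSA configuration injected into the pod
  run: |
    # Both variables are set by the EKS pod identity webhook when the
    # service account carries the role-arn annotation.
    echo "Role: ${AWS_ROLE_ARN}"
    echo "Token file: ${AWS_WEB_IDENTITY_TOKEN_FILE}"
- name: Verify the assumed role
  run: aws sts get-caller-identity
```

If the second step reports the node's instance role instead of github-runner-role, the service account annotation or the role's trust policy is the first place to look.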

Multi-Team Setup: Namespace Isolation

Install one controller, multiple runner scale sets — one per team or environment. Each namespace needs its own copy of the GitHub App secret, since Kubernetes secrets are namespace-scoped:

bash

# Namespaces must exist before the secrets can be created in them
kubectl create namespace arc-platform
kubectl create namespace arc-appteam

# Copy the GitHub App secret into each team namespace
kubectl create secret generic arc-github-secret \
  --namespace arc-platform \
  --from-literal=github_app_id="123456" \
  --from-literal=github_app_installation_id="789012" \
  --from-file=github_app_private_key=./private-key.pem

kubectl create secret generic arc-github-secret \
  --namespace arc-appteam \
  --from-literal=github_app_id="123456" \
  --from-literal=github_app_installation_id="789012" \
  --from-file=github_app_private_key=./private-key.pem

bash

# Platform team runners (large compute, cluster access)
helm install arc-platform \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set \
  --namespace arc-platform \
  --create-namespace \
  --set githubConfigUrl=https://github.com/your-org \
  --set githubConfigSecret=arc-github-secret \
  --set runnerScaleSetName=platform-runners \
  --set maxRunners=10 \
  --set template.spec.serviceAccountName=arc-platform-runner

# Application team runners (standard compute, no cluster access)
helm install arc-appteam \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set \
  --namespace arc-appteam \
  --create-namespace \
  --set githubConfigUrl=https://github.com/your-org \
  --set githubConfigSecret=arc-github-secret \
  --set runnerScaleSetName=app-runners \
  --set maxRunners=30 \
  --set template.spec.serviceAccountName=arc-appteam-runner

Each namespace has its own service account with scoped IAM permissions and K8s RBAC. The platform team's runners can run kubectl against the cluster; the app team's runners can only push to ECR. Workflows reference the scale set by name:

yaml

# Platform team workflow
runs-on: platform-runners

# App team workflow
runs-on: app-runners
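The Kubernetes side of that scoping is a Role and RoleBinding per team. The sketch below grants the platform runners' service account deploy rights in a single target namespace; the namespace apps, the Role name, and the chosen verbs are assumptions for illustration, not part of the setup above.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployer
  namespace: apps                    # hypothetical namespace the platform team deploys into
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "patch", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: arc-platform-runner-deployer
  namespace: apps
subjects:
  - kind: ServiceAccount
    name: arc-platform-runner        # service account used by the platform scale set
    namespace: arc-platform
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: deployer
```

The app team's service account simply gets no binding, so kubectl calls from its runners are denied by default.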

Caching on Ephemeral Runners

Each runner pod starts clean — no warm local cache from previous builds. actions/cache still works because it stores to and restores from GitHub's remote cache backend, not the runner's local disk. The cost is a network round-trip to restore the cache on every job.
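A typical cache step on an ephemeral runner looks the same as on a hosted one. This sketch assumes an npm project with a package-lock.json; the key prefix is arbitrary.

```yaml
steps:
  - uses: actions/checkout@v4
  - name: Restore npm cache
    uses: actions/cache@v4
    with:
      path: ~/.npm
      # New key whenever the lockfile changes; fall back to the newest older cache
      key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
      restore-keys: |
        npm-${{ runner.os }}-
  - name: Install dependencies
    run: npm ci
```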

For the common cases:

Docker layer caching (DinD mode): use registry-based cache with docker/build-push-action — set cache-from: type=registry,ref=your-registry/cache-repo and cache-to: type=registry,ref=your-registry/cache-repo,mode=max. Layer cache persists in ECR between builds without relying on the runner's disk.

Docker layer caching (Kubernetes mode): docker/build-push-action doesn't apply — there's no Docker daemon. Use Kaniko with --cache=true --cache-repo=your-ecr-repo/cache to get equivalent registry-based layer reuse.

Node.js / npm: actions/cache with the ~/.npm path works fine. For typical npm caches, the restore from GitHub is fast enough that the cold-runner overhead is minimal.

Large dependency caches (Maven, Gradle, Python): if cache restore is slow, consider pinning runners to a labelled node group using nodeAffinity so that a node-local cache can be reused across jobs. Label a set of nodes role: ci-cache and add a requiredDuringSchedulingIgnoredDuringExecution node affinity to the runner template. Note that the work volume ARC provisions is created per runner pod, so the reuse has to come from a separate node-local volume (for example a hostPath directory) mounted into the runner. This trades full ephemeral isolation for cache warmth, which is acceptable for trusted internal builds on a dedicated node pool.
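The node affinity described above might look like this in the runner scale set values. It is a sketch that only handles scheduling; the nodes are assumed to already carry the label role=ci-cache, and any cache volume still has to be added separately.

```yaml
template:
  spec:
    affinity:
      nodeAffinity:
        # Hard requirement: runner pods stay Pending if no ci-cache node has capacity
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: role
                  operator: In
                  values: ["ci-cache"]
    containers:
      - name: runner
        image: ghcr.io/actions/actions-runner:latest
        command: ["/home/runner/run.sh"]
```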

Monitoring

The ARC controller and listener pods expose Prometheus metrics, served in this setup on port 8080 at path /metrics. They fall into four categories per the official docs:

Controller gauges (emitted by the controller manager pod):

Pending ephemeral runners (runners created, not yet registered)
Running ephemeral runners (runners actively registered)
Failed ephemeral runners
Running listener pods

Listener gauges (emitted by the AutoscalingListener pod per scale set):

Assigned jobs
Registered runners
Desired runners

Listener counters: started jobs, completed jobs

Listener histograms: job startup duration, job execution duration
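Metrics may need to be switched on explicitly in the controller chart. The keys below reflect my understanding of the gha-runner-scale-set-controller values file; confirm them with helm show values for your chart version before relying on them.

```yaml
# controller-values.yaml (passed to the controller chart with --values)
replicaCount: 2
metrics:
  controllerManagerAddr: ":8080"   # address the controller manager serves metrics on
  listenerAddr: ":8080"            # address each AutoscalingListener pod serves metrics on
  listenerEndpoint: "/metrics"     # path for the listener metrics endpoint
```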

The exact metric names can change between chart versions, so inspect http://localhost:8080/metrics after port-forwarding to the controller pod to see the actual names for your installed version (adjust the label selector if your chart version labels the controller pod differently):

bash

kubectl port-forward -n arc-systems \
  $(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller -o name | head -1) \
  8080:8080

curl -s http://localhost:8080/metrics | grep -v "^#" | head -30

A Prometheus scrape config for the controller pod:

yaml

- job_name: arc-controller
  kubernetes_sd_configs:
    - role: pod
      namespaces:
        names: [arc-systems]
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name]
      regex: gha-runner-scale-set-controller
      action: keep
  metrics_path: /metrics
  scheme: http

The key alert signal: desired runners increasing but registered runners not keeping up means runner pods are failing to start. Check pod events in the runner namespace — common causes are insufficient node capacity, image pull errors, or PVC provisioning timeout.
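That signal can be expressed as a Prometheus alerting rule. In this sketch the metric names gha_desired_runners and gha_registered_runners are assumptions; substitute whatever names your /metrics endpoint actually reports. The 10-minute window and the severity label are arbitrary starting points.

```yaml
groups:
  - name: arc-runners
    rules:
      - alert: ArcRunnersNotRegistering
        # Fires when the scale set wants more runners than have registered
        expr: (gha_desired_runners - gha_registered_runners) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "ARC desired runners exceed registered runners"
          description: "Runner pods may be failing to start; check pod events in the runner namespace."
```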


