16 min read · May 5, 2026

GitHub Actions CI/CD for Kubernetes: End-to-End Pipeline Guide

A Kubernetes CI/CD pipeline has more moving parts than a traditional deploy: image build, vulnerability scan, image sign, Helm chart update, GitOps sync, and smoke test — all chained together with the right secrets, caching, and failure handling. Here's the complete pipeline.

Ajeet Yadav
Platform & Cloud Engineer

A naive Kubernetes CI/CD pipeline builds a Docker image, pushes it to a registry, and runs kubectl set image. This works, but it bypasses every safety control that matters in production: vulnerability scanning, image signing, GitOps reconciliation, and staged rollout.

A production pipeline is a chain: test → build → scan → push → sign → update GitOps repo → sync → verify. Signing comes after the push because cosign signs the image digest stored in the registry. Each stage has failure conditions that should block the next stage. Getting this chain right requires understanding each component's role and the common failure modes at each step.


Pipeline Architecture

PR   → Test → Build image → Scan (Trivy)
main → Build image → Push to ECR → Scan (Trivy) → Sign (cosign) → Update Helm values → Argo CD sync → Smoke test

On pull requests: Run tests + build + scan (no push, no deploy). PRs get a preview of what would happen without actually deploying.

On merge to main: Full pipeline — push image, sign, update GitOps repo, trigger Argo CD sync, run smoke test against the deployed version.


Workflow Structure

Split the pipeline into focused, reusable workflows. GitHub only loads workflow files from the top level of .github/workflows, so the reusable workflows sit beside the others rather than in a subdirectory:

.github/workflows/
├── ci.yml              # PR checks: lint, test, build, scan
├── deploy.yml          # Main branch: push, sign, update GitOps, verify
├── build-image.yml     # Reusable: build + scan + push
└── smoke-test.yml      # Reusable: post-deploy verification

CI Workflow (Pull Requests)

yaml
# .github/workflows/ci.yml
name: CI

on:
  pull_request:
    branches: [main]

permissions:
  contents: read
  pull-requests: write
  security-events: write   # For uploading Trivy SARIF results

env:
  # Docker tags must be lowercase; adjust if the repo slug contains capitals
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Lint
        run: npm run lint

      - name: Test
        run: npm test -- --coverage

      - name: Upload coverage
        uses: codecov/codecov-action@v4
        with:
          token: ${{ secrets.CODECOV_TOKEN }}

  build-and-scan:
    name: Build and Scan
    runs-on: ubuntu-latest
    needs: test
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build image (no push on PR)
        uses: docker/build-push-action@v6
        with:
          context: .
          push: false
          load: true       # Load into the local Docker daemon so Trivy can scan it
          tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@0.20.0  # Pin to a specific version for supply chain security; never use @master
        with:
          image-ref: ${{ env.IMAGE_NAME }}:${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
          severity: CRITICAL,HIGH
          exit-code: '1'   # Fail on CRITICAL or HIGH findings (see severity above)

      - name: Upload Trivy scan results
        uses: github/codeql-action/upload-sarif@v3
        if: always()       # Upload even if scan found vulnerabilities
        with:
          sarif_file: trivy-results.sarif
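
One detail the CI workflow depends on: Docker image references must be lowercase, while `github.repository` keeps the casing of the owner and repository name. The following is a minimal sketch of normalizing the name in a shell step before tagging. The helper name `image_ref` is hypothetical, and the sample values are placeholders.

```shell
#!/bin/sh
# image_ref REPO_SLUG SHA
# Prints a lowercase image reference such as "my-org/my-api:0123abc".
# image_ref is a hypothetical helper, not part of any action.
image_ref() {
  # tr lowercases the slug in a POSIX-portable way
  _ir_repo=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  printf '%s:%s\n' "$_ir_repo" "$2"
}

# Placeholder values; in a workflow step the runner provides
# GITHUB_REPOSITORY and GITHUB_SHA instead.
image_ref "My-Org/My-API" "0123abc"   # prints my-org/my-api:0123abc
```

In a workflow, the result could be written to `$GITHUB_ENV` in an early step so later steps read the normalized name.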

Deploy Workflow (Main Branch)

yaml
# .github/workflows/deploy.yml
name: Deploy

on:
  push:
    branches: [main]

permissions:
  contents: read
  id-token: write    # OIDC token for AWS/GCP auth

env:
  AWS_REGION: us-east-1
  ECR_REGISTRY: 123456789.dkr.ecr.us-east-1.amazonaws.com
  IMAGE_REPO: 123456789.dkr.ecr.us-east-1.amazonaws.com/my-org/api
  GITOPS_REPO: my-org/gitops-config

jobs:
  build-push-sign:
    name: Build, Push, Sign
    runs-on: ubuntu-latest
    outputs:
      image-digest: ${{ steps.build.outputs.digest }}
      image-tag: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials (OIDC, no long-lived keys)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/github-actions-ecr
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Docker metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.IMAGE_REPO }}
          # priority=1000 makes the short SHA the `version` output; without it
          # the branch tag ("main") outranks the SHA and becomes image-tag
          tags: |
            type=sha,prefix=,suffix=,format=short,priority=1000
            type=ref,event=branch
            type=semver,pattern={{version}}

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build and push
        id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=registry,ref=${{ env.IMAGE_REPO }}:cache
          cache-to: type=registry,ref=${{ env.IMAGE_REPO }}:cache,mode=max
          provenance: true    # SLSA provenance attestation

      - name: Scan pushed image
        uses: aquasecurity/trivy-action@0.20.0  # Pin to a specific version for supply chain security; never use @master
        with:
          image-ref: ${{ env.IMAGE_REPO }}@${{ steps.build.outputs.digest }}
          exit-code: '1'
          severity: CRITICAL

      - name: Install cosign
        uses: sigstore/cosign-installer@v3

      - name: Install syft (SBOM generator)
        uses: anchore/sbom-action/download-syft@v0

      - name: Sign image (keyless via OIDC)
        run: |
          cosign sign --yes \
            ${{ env.IMAGE_REPO }}@${{ steps.build.outputs.digest }}

      - name: Generate SBOM and attest
        run: |
          # Generate SBOM with syft
          syft ${{ env.IMAGE_REPO }}@${{ steps.build.outputs.digest }} \
            -o spdx-json > sbom.json
          # Attest SBOM to image (spdxjson matches the JSON output above)
          cosign attest --yes \
            --predicate sbom.json \
            --type spdxjson \
            ${{ env.IMAGE_REPO }}@${{ steps.build.outputs.digest }}

  update-gitops:
    name: Update GitOps Config
    runs-on: ubuntu-latest
    needs: build-push-sign
    steps:
      - name: Checkout GitOps repo
        uses: actions/checkout@v4
        with:
          repository: ${{ env.GITOPS_REPO }}
          token: ${{ secrets.GITOPS_TOKEN }}   # PAT with repo write access
          path: gitops

      - name: Update image tag in Helm values
        run: |
          cd gitops
          # Update the image tag using yq
          yq eval \
            '.image.tag = "${{ needs.build-push-sign.outputs.image-tag }}"' \
            -i apps/api/values-production.yaml

          # Or for digest-pinned deployments:
          yq eval \
            '.image.digest = "${{ needs.build-push-sign.outputs.image-digest }}"' \
            -i apps/api/values-production.yaml

      - name: Commit and push
        run: |
          cd gitops
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add apps/api/values-production.yaml
          git commit -m "chore: update api image to ${{ needs.build-push-sign.outputs.image-tag }}

          Deployed by: ${{ github.actor }}
          Commit: ${{ github.sha }}
          Workflow: ${{ github.run_id }}"
          git push

  wait-and-verify:
    name: Wait for Sync and Verify
    runs-on: ubuntu-latest
    # build-push-sign must be a direct dependency: the `needs` context only
    # exposes outputs of jobs listed here
    needs: [build-push-sign, update-gitops]
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/github-actions-eks
          aws-region: ${{ env.AWS_REGION }}

      - name: Update kubeconfig
        run: |
          aws eks update-kubeconfig \
            --name my-cluster \
            --region ${{ env.AWS_REGION }}

      - name: Wait for Argo CD sync
        env:
          # The argocd CLI reads the server, API token, and default flags from
          # these variables, so no interactive login is needed
          ARGOCD_SERVER: ${{ secrets.ARGOCD_SERVER }}
          ARGOCD_AUTH_TOKEN: ${{ secrets.ARGOCD_TOKEN }}
          ARGOCD_OPTS: --grpc-web
        run: |
          # Install argocd CLI (pin a release version in production)
          sudo curl -sSL -o /usr/local/bin/argocd \
            https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64
          sudo chmod +x /usr/local/bin/argocd

          # Wait for sync (timeout 5 minutes)
          argocd app wait api \
            --sync \
            --health \
            --timeout 300

      - name: Run smoke tests
        run: |
          # Wait for rollout to complete
          kubectl rollout status deployment/api -n production --timeout=300s

          # Basic smoke test
          API_URL=$(kubectl get ingress api -n production \
            -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

          # Health check
          curl -f https://$API_URL/healthz || exit 1

          # Version check: verify the new image is running
          DEPLOYED_VERSION=$(curl -s https://$API_URL/version | jq -r '.version')
          if [ "$DEPLOYED_VERSION" != "${{ needs.build-push-sign.outputs.image-tag }}" ]; then
            echo "Version mismatch: expected ${{ needs.build-push-sign.outputs.image-tag }}, got $DEPLOYED_VERSION"
            exit 1
          fi
          echo "Deployment verified: $DEPLOYED_VERSION"
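
The smoke test above makes a single `curl` attempt, which can fail while the load balancer or new pods are still warming up. The following is a minimal sketch of a retry wrapper that could surround the health check. The function name `retry` and the attempt and delay values are assumptions for illustration, not part of the original pipeline.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
  _retry_max=$1
  _retry_delay=$2
  shift 2
  _retry_n=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$_retry_n" -ge "$_retry_max" ]; then
      echo "failed after $_retry_n attempts: $*" >&2
      return 1
    fi
    _retry_n=$((_retry_n + 1))
    sleep "$_retry_delay"
  done
}

# Hypothetical usage inside the smoke-test step (not executed here):
#   retry 5 10 curl -fsS "https://$API_URL/healthz"
```

A fixed delay keeps the sketch short; exponential backoff is a reasonable alternative when the endpoint sits behind a slow-to-register load balancer.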

AWS Authentication: OIDC Instead of Long-Lived Keys

Never store AWS access keys in GitHub Secrets. Use OIDC federation — GitHub Actions requests a short-lived token from GitHub's OIDC provider, which AWS trusts:

hcl
# Terraform: set up OIDC trust for GitHub Actions
resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  # AWS validates GitHub's OIDC provider against its own trusted CAs, so the
  # thumbprint is not used. Older AWS provider releases may still require a
  # non-empty list here.
  thumbprint_list = []
}

resource "aws_iam_role" "github_actions_ecr" {
  name = "github-actions-ecr"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Federated = aws_iam_openid_connect_provider.github.arn
      }
      Action = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
        StringLike = {
          # Only allow from your specific repo and main branch
          "token.actions.githubusercontent.com:sub" = [
            "repo:my-org/my-repo:ref:refs/heads/main"
          ]
        }
      }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "ecr_push" {
  role       = aws_iam_role.github_actions_ecr.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPowerUser"
}

The sub condition restricts the role to only your repository and main branch — a compromised workflow on a fork can't assume this role.
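
To make the effect of the sub condition concrete, here is a minimal sketch that approximates IAM's `StringLike` matching with shell glob patterns. The helper name `sub_allowed` is hypothetical, and the approximation is loose: shell patterns also treat `[...]` specially, which IAM does not. The claim formats shown (`ref:refs/heads/<branch>` for branch pushes, `pull_request` for PR runs, `environment:<name>` for jobs that declare an environment) follow GitHub's documented OIDC subject formats.

```shell
#!/bin/sh
# sub_allowed PATTERN SUB
# Succeeds when the token's sub claim matches the glob PATTERN,
# roughly how a StringLike condition with * wildcards behaves.
sub_allowed() {
  case "$2" in
    $1) return 0 ;;   # unquoted on purpose so * acts as a wildcard
    *)  return 1 ;;
  esac
}

policy='repo:my-org/my-repo:ref:refs/heads/main'

# A push to main in the trusted repo matches the policy.
sub_allowed "$policy" 'repo:my-org/my-repo:ref:refs/heads/main' && echo "main push: allowed"

# A pull request run carries a different sub claim and is rejected.
sub_allowed "$policy" 'repo:my-org/my-repo:pull_request' || echo "pull request: denied"

# A job that declares `environment: production` is also rejected, because
# its sub claim names the environment instead of the branch.
sub_allowed "$policy" 'repo:my-org/my-repo:environment:production' || echo "environment job: denied"
```

The last case matters if the deploy job is later moved behind a GitHub Environment: the trust policy would then need a matching `environment:` pattern.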


Reusable Workflows

Extract common steps into reusable workflows to avoid duplication across repositories:

yaml
# .github/workflows/build-image.yml  (in platform-team/workflows repo)
# Reusable workflows must live directly in .github/workflows;
# subdirectories are not supported.
name: Build and Push Image

on:
  workflow_call:
    inputs:
      image-repo:
        required: true
        type: string
      context:
        required: false
        type: string
        default: '.'
    outputs:
      image-digest:
        value: ${{ jobs.build.outputs.digest }}
      image-tag:
        value: ${{ jobs.build.outputs.tag }}
    secrets:
      aws-role-arn:
        required: true

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      digest: ${{ steps.build.outputs.digest }}
      tag: ${{ steps.meta.outputs.version }}
    steps:
      # ... full build/scan/sign steps

Each service calls the reusable workflow:

yaml
# In each service's .github/workflows/deploy.yml
jobs:
  build:
    uses: platform-team/workflows/.github/workflows/build-image.yml@main
    with:
      image-repo: 123456789.dkr.ecr.us-east-1.amazonaws.com/my-org/${{ github.event.repository.name }}
    secrets:
      aws-role-arn: ${{ secrets.AWS_ECR_ROLE_ARN }}

When the platform team improves the build workflow (faster caching, better scanning, new attestations), all service pipelines inherit the improvement automatically — no per-repo changes.

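Security note: Reference reusable workflows by full commit SHA for production pipelines, same as third-party actions. The trade-off is that pinned callers pick up platform-team improvements only when the SHA is bumped:

yaml
uses: platform-team/workflows/.github/workflows/build-image.yml@<full-40-character-commit-sha>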
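
A pin only protects the pipeline if it is a full-length commit SHA; tags and branches can be moved to point at different code. The following is a minimal sketch of a check that could run in CI against each `uses:` reference. The helper name `is_sha_pinned` and the sample references are hypothetical.

```shell
#!/bin/sh
# is_sha_pinned REFERENCE
# Succeeds only when the part after "@" is a 40-character lowercase hex SHA.
is_sha_pinned() {
  _pin_ref=${1##*@}
  # No "@" at all: the expansion above leaves the string unchanged
  [ "$_pin_ref" != "$1" ] || return 1
  [ "${#_pin_ref}" -eq 40 ] || return 1
  case "$_pin_ref" in
    *[!0-9a-f]*) return 1 ;;   # any non-hex character fails the check
  esac
  return 0
}

# Illustrative references (the SHA is a dummy value):
is_sha_pinned "actions/checkout@0123456789abcdef0123456789abcdef01234567" && echo "pinned"
is_sha_pinned "actions/checkout@v4" || echo "not pinned: tag"
is_sha_pinned "platform-team/workflows/.github/workflows/build-image.yml@main" || echo "not pinned: branch"
```

Extracting the `uses:` values from workflow files is left out; a line-oriented `grep` over `.github/workflows/*.yml` is one possible starting point.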

Secrets Management in Pipelines

What goes in GitHub Secrets:

  • GITOPS_TOKEN — PAT for writing to the GitOps repo
  • ARGOCD_TOKEN — Argo CD API token for sync/wait
  • CODECOV_TOKEN — coverage reporting

What does NOT go in GitHub Secrets:

  • AWS credentials (use OIDC)
  • GCP credentials (use Workload Identity Federation)
  • Container registry credentials (use OIDC-based registry auth)
  • Kubernetes credentials (use OIDC-assumed IAM role → aws eks update-kubeconfig)

The pattern: use federated OIDC for cloud credentials, reserve GitHub Secrets for service tokens that don't support OIDC.

For runtime secrets (database passwords, API keys your application uses), use AWS Secrets Manager or SSM Parameter Store + External Secrets Operator. Never pass them through GitHub Actions environment variables: log masking only covers the exact registered secret value, so a step that prints a transformed or derived copy (base64, a substring, a JSON-wrapped value) can still leak it into workflow logs.


Caching Strategy

Effective caching dramatically reduces build times:

yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3

# cache-from lists two sources: the GitHub Actions cache (scoped by branch)
# and the registry cache (shared across branches). The comments stay outside
# the block scalar because everything inside `|` is passed to the action as-is.
- name: Build with layer cache
  uses: docker/build-push-action@v6
  with:
    cache-from: |
      type=gha
      type=registry,ref=${{ env.IMAGE_REPO }}:cache
    cache-to: type=gha,mode=max                   # Write to GHA cache

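GitHub Actions cache (type=gha) is fast and free but scoped by branch. A PR run can restore caches written by its own branch and by the base or default branch, but not by sibling branches, and runs on main never see caches written on PR branches.

Registry cache (type=registry) stores cache manifests in the registry alongside your image. All branches share this cache. More expensive (registry storage costs) but a better hit rate when feature branches build on each other's layers or the GitHub Actions cache has been evicted.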

For monorepos with multiple services, use separate cache keys per service:

yaml
cache-from: type=registry,ref=${{ env.IMAGE_REPO }}/service-a:cache
cache-to: type=registry,ref=${{ env.IMAGE_REPO }}/service-a:cache,mode=max

Environment Promotion

For staging → production promotion:

yaml
# Separate deploy jobs per environment
# (GitOps repo checkout and the commit/push steps are omitted for brevity)
jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - name: Update staging values
        run: |
          yq eval '.image.tag = "${{ env.TAG }}"' \
            -i apps/api/values-staging.yaml

  # integration-tests job (not shown) runs against staging

  deploy-production:
    runs-on: ubuntu-latest
    environment: production   # Requires manual approval via GitHub environment protection
    needs: [deploy-staging, integration-tests]
    steps:
      - name: Update production values
        run: |
          yq eval '.image.tag = "${{ env.TAG }}"' \
            -i apps/api/values-production.yaml

GitHub Environments with required reviewers create a manual approval gate before the production deploy job runs. The job is paused, the reviewers are notified, and they approve or reject it.


Monitoring the Pipeline Itself

yaml
- name: Notify Slack on failure
  if: failure()
  uses: slackapi/slack-github-action@v1
  with:
    payload: |
      {
        "text": "❌ Deploy failed for *${{ github.repository }}*",
        "blocks": [
          {
            "type": "section",
            "text": {
              "type": "mrkdwn",
              "text": "*Deploy failed*\n*Repo:* ${{ github.repository }}\n*Branch:* ${{ github.ref_name }}\n*Commit:* ${{ github.sha }}\n*Actor:* ${{ github.actor }}\n*Run:* ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
            }
          }
        ]
      }
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_DEPLOY_WEBHOOK }}
    # v1 of the action expects this when the URL is an incoming webhook
    SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK

Also track pipeline metrics in Grafana. GitHub Actions exposes a REST API for workflow run history, so once a scheduled job or exporter feeds that data into a Grafana data source, a dashboard showing deploy frequency, success rate, and lead time gives you the core DORA metrics without a dedicated metrics product.
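
As a starting point for the success-rate metric, here is a minimal sketch that summarizes a list of workflow run conclusions. It assumes the conclusions have already been extracted one per line, for example with the GitHub CLI as in the commented usage; the function name `success_rate` and the workflow file name in that usage line are assumptions.

```shell
#!/bin/sh
# success_rate
# Reads one run conclusion per line on stdin (success, failure, cancelled, ...)
# and prints "<successes> <total> <percent>".
success_rate() {
  awk '
    NF { total++; if ($1 == "success") ok++ }
    END {
      if (total == 0) { print "0 0 0.0"; exit }
      printf "%d %d %.1f\n", ok, total, 100 * ok / total
    }'
}

# Hypothetical usage with an authenticated GitHub CLI (not executed here):
#   gh run list --workflow deploy.yml --branch main --limit 100 \
#     --json conclusion --jq '.[].conclusion' | success_rate

# Self-contained example with sample data:
printf 'success\nfailure\nsuccess\nsuccess\n' | success_rate   # prints 3 4 75.0
```

Deploy frequency and lead time need timestamps as well as conclusions, so a fuller version would also request the run creation and update times.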


Frequently Asked Questions

How do I handle database migrations in this pipeline?

Run migrations as a Kubernetes Job triggered by the pipeline, after the new image is pushed but before the Deployment is updated. The migration Job uses the new image version and runs with a migration-specific service account:

yaml
- name: Run migrations
  run: |
    # Assumes the migration-template CronJob already references the new image:
    # a Job created with --from copies the template's pod spec as-is
    kubectl create job migration-${{ github.sha }} \
      --from=cronjob/migration-template \
      --namespace production
    kubectl wait job/migration-${{ github.sha }} \
      --for=condition=complete \
      --timeout=300s \
      --namespace production

Make all migrations backward-compatible — the old application version must handle the new schema while the migration runs and before the new Deployment rolls out.

Should I push the image before or after updating the GitOps repo?

Push first, then update GitOps. If you update the GitOps repo first, Argo CD tries to sync and pulls an image tag that doesn't exist yet — causing a brief period of failing pods. Always push image → verify push succeeded → update GitOps.

How do I roll back a failed deployment?

Since the GitOps repo is the source of truth, rollback is a git revert:

bash
# Revert the image tag update commit
git revert HEAD --no-edit
git push

# Argo CD detects the change and syncs back to the previous image

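Or trigger via the Argo CD UI/CLI (Argo CD refuses a rollback while automated sync is enabled on the application, so disable auto-sync first):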

bash
argocd app rollback api <previous-sync-id>

How do I preview environments for pull requests?

Use Argo CD ApplicationSet with a PR generator:

yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
spec:
  generators:
    - pullRequest:
        github:
          owner: my-org
          repo: my-app
          labels: [preview]   # Only PRs with this label get preview envs
  template:
    spec:
      destination:
        namespace: preview-{{number}}

Each labelled PR gets a namespace with the PR's image deployed. When the PR is closed, the ApplicationSet deletes the generated Application and the resources it manages; the namespace itself is removed only if it is one of those managed resources. For a deeper look at all ApplicationSet generators and multi-cluster patterns, see Argo CD ApplicationSet: Multi-Cluster Deployment and Generator Patterns.


For GitOps setup that this pipeline writes to, see GitOps with Argo CD: Production Setup Guide. For supply chain security tooling (Sigstore, SLSA, OPA admission, Kyverno image policies), see Supply Chain Security Tools for Kubernetes.

Building a Kubernetes CI/CD pipeline from scratch? Talk to us at Coding Protocols — we help platform teams design pipelines that are secure by default and fast enough that developers actually use them.
