GitHub Actions CI/CD for Kubernetes: End-to-End Pipeline Guide
A Kubernetes CI/CD pipeline has more moving parts than a traditional deploy: image build, vulnerability scan, image sign, Helm chart update, GitOps sync, and smoke test — all chained together with the right secrets, caching, and failure handling. Here's the complete pipeline.

A naive Kubernetes CI/CD pipeline builds a Docker image, pushes it to a registry, and runs kubectl set image. This works, but it bypasses every safety control that matters in production: vulnerability scanning, image signing, GitOps reconciliation, and staged rollout.
A production pipeline is a chain: test → build → scan → sign → push → update GitOps repo → sync → verify. Each stage has failure conditions that should block the next stage. Getting this chain right requires understanding each component's role and the common failure modes at each step.
Pipeline Architecture
```
PR:   Test → Build image → Scan (Trivy)                        (no push, no deploy)
main: Test → Build → Scan → Push to ECR → Sign (cosign) → Update Helm values → Argo CD sync → Smoke test
```
On pull requests: Run tests + build + scan (no push, no deploy). PRs get a preview of what would happen without actually deploying.
On merge to main: Full pipeline — push image, sign, update GitOps repo, trigger Argo CD sync, run smoke test against the deployed version.
Workflow Structure
Split the pipeline into focused, reusable workflows. GitHub Actions doesn't support subdirectories inside .github/workflows, so reusable workflows sit alongside the entry-point workflows with a naming prefix:

```
.github/workflows/
├── ci.yml                      # PR checks: lint, test, build, scan
├── deploy.yml                  # Main branch: push, sign, update GitOps, verify
├── reusable-build-image.yml    # Reusable: build + scan + push
└── reusable-smoke-test.yml     # Reusable: post-deploy verification
```
CI Workflow (Pull Requests)
```yaml
# .github/workflows/ci.yml
name: CI

on:
  pull_request:
    branches: [main]

permissions:
  contents: read
  pull-requests: write
  security-events: write  # For uploading Trivy SARIF results

env:
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Lint
        run: npm run lint

      - name: Test
        run: npm test -- --coverage

      - name: Upload coverage
        uses: codecov/codecov-action@v4
        with:
          token: ${{ secrets.CODECOV_TOKEN }}

  build-and-scan:
    name: Build and Scan
    runs-on: ubuntu-latest
    needs: test
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build image (no push on PR)
        uses: docker/build-push-action@v6
        with:
          context: .
          push: false
          load: true
          tags: ${{ env.IMAGE_NAME }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@0.20.0  # Pin to a specific version for supply chain security — never use @master
        with:
          image-ref: ${{ env.IMAGE_NAME }}:${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
          severity: CRITICAL,HIGH
          exit-code: '1'  # Fail on CRITICAL/HIGH findings

      - name: Upload Trivy scan results
        uses: github/codeql-action/upload-sarif@v3
        if: always()  # Upload even if the scan found vulnerabilities
        with:
          sarif_file: trivy-results.sarif
```

Deploy Workflow (Main Branch)
```yaml
# .github/workflows/deploy.yml
name: Deploy

on:
  push:
    branches: [main]

permissions:
  contents: read
  id-token: write  # OIDC token for AWS/GCP auth

env:
  AWS_REGION: us-east-1
  ECR_REGISTRY: 123456789.dkr.ecr.us-east-1.amazonaws.com
  IMAGE_REPO: 123456789.dkr.ecr.us-east-1.amazonaws.com/my-org/api
  GITOPS_REPO: my-org/gitops-config

jobs:
  build-push-sign:
    name: Build, Push, Sign
    runs-on: ubuntu-latest
    outputs:
      image-digest: ${{ steps.build.outputs.digest }}
      image-tag: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials (OIDC — no long-lived keys)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/github-actions-ecr
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Docker metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.IMAGE_REPO }}
          tags: |
            type=sha,prefix=,suffix=,format=short
            type=ref,event=branch
            type=semver,pattern={{version}}

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build and push
        id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=registry,ref=${{ env.IMAGE_REPO }}:cache
          cache-to: type=registry,ref=${{ env.IMAGE_REPO }}:cache,mode=max
          provenance: true  # SLSA provenance attestation

      - name: Scan pushed image
        uses: aquasecurity/trivy-action@0.20.0  # Pin to a specific version for supply chain security — never use @master
        with:
          image-ref: ${{ env.IMAGE_REPO }}@${{ steps.build.outputs.digest }}
          exit-code: '1'
          severity: CRITICAL

      - name: Install cosign
        uses: sigstore/cosign-installer@v3

      - name: Install syft (SBOM generator)
        uses: anchore/sbom-action/download-syft@v0

      - name: Sign image (keyless via OIDC)
        run: |
          cosign sign --yes \
            ${{ env.IMAGE_REPO }}@${{ steps.build.outputs.digest }}

      - name: Generate SBOM and attest
        run: |
          # Generate SBOM with syft
          syft ${{ env.IMAGE_REPO }}@${{ steps.build.outputs.digest }} \
            -o spdx-json > sbom.json
          # Attest SBOM to image
          cosign attest --yes \
            --predicate sbom.json \
            --type spdx \
            ${{ env.IMAGE_REPO }}@${{ steps.build.outputs.digest }}

  update-gitops:
    name: Update GitOps Config
    runs-on: ubuntu-latest
    needs: build-push-sign
    steps:
      - name: Checkout GitOps repo
        uses: actions/checkout@v4
        with:
          repository: ${{ env.GITOPS_REPO }}
          token: ${{ secrets.GITOPS_TOKEN }}  # PAT with repo write access
          path: gitops

      - name: Update image tag in Helm values
        run: |
          cd gitops
          # Update the image tag using yq
          yq eval \
            '.image.tag = "${{ needs.build-push-sign.outputs.image-tag }}"' \
            -i apps/api/values-production.yaml

          # Or for digest-pinned deployments:
          yq eval \
            '.image.digest = "${{ needs.build-push-sign.outputs.image-digest }}"' \
            -i apps/api/values-production.yaml

      - name: Commit and push
        run: |
          cd gitops
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add apps/api/values-production.yaml
          git commit -m "chore: update api image to ${{ needs.build-push-sign.outputs.image-tag }}

          Deployed by: ${{ github.actor }}
          Commit: ${{ github.sha }}
          Workflow: ${{ github.run_id }}"
          git push

  wait-and-verify:
    name: Wait for Sync and Verify
    runs-on: ubuntu-latest
    # build-push-sign must be listed here too — a job can only read outputs
    # of jobs named in its needs list
    needs: [build-push-sign, update-gitops]
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/github-actions-eks
          aws-region: ${{ env.AWS_REGION }}

      - name: Update kubeconfig
        run: |
          aws eks update-kubeconfig \
            --name my-cluster \
            --region ${{ env.AWS_REGION }}

      - name: Wait for Argo CD sync
        run: |
          # Install the argocd CLI
          curl -sSL -o /usr/local/bin/argocd \
            https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64
          chmod +x /usr/local/bin/argocd

          # Login to Argo CD
          argocd login ${{ secrets.ARGOCD_SERVER }} \
            --auth-token ${{ secrets.ARGOCD_TOKEN }} \
            --grpc-web

          # Wait for sync and health (timeout 5 minutes)
          argocd app wait api \
            --sync \
            --health \
            --timeout 300

      - name: Run smoke tests
        run: |
          # Wait for the rollout to complete
          kubectl rollout status deployment/api -n production --timeout=300s

          # Basic smoke test
          API_URL=$(kubectl get ingress api -n production \
            -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

          # Health check
          curl -f https://$API_URL/healthz || exit 1

          # Version check — verify the new image is running
          DEPLOYED_VERSION=$(curl -s https://$API_URL/version | jq -r '.version')
          if [ "$DEPLOYED_VERSION" != "${{ needs.build-push-sign.outputs.image-tag }}" ]; then
            echo "Version mismatch: expected ${{ needs.build-push-sign.outputs.image-tag }}, got $DEPLOYED_VERSION"
            exit 1
          fi
          echo "Deployment verified: $DEPLOYED_VERSION"
```

AWS Authentication: OIDC Instead of Long-Lived Keys
Never store AWS access keys in GitHub Secrets. Use OIDC federation — GitHub Actions requests a short-lived token from GitHub's OIDC provider, which AWS trusts:
```terraform
# Terraform: set up OIDC trust for GitHub Actions
resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = []  # AWS validates GitHub's OIDC provider automatically; thumbprint not required
}

resource "aws_iam_role" "github_actions_ecr" {
  name = "github-actions-ecr"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Federated = aws_iam_openid_connect_provider.github.arn
      }
      Action = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
        StringLike = {
          # Only allow from your specific repo and main branch
          "token.actions.githubusercontent.com:sub" = [
            "repo:my-org/my-repo:ref:refs/heads/main"
          ]
        }
      }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "ecr_push" {
  role       = aws_iam_role.github_actions_ecr.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPowerUser"
}
```

The sub condition restricts the role to only your repository and main branch — a compromised workflow on a fork can't assume this role.
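The subject claim also supports finer-grained patterns if the role is ever needed from other contexts. These values are illustrative, not drop-in config; adjust the org, repo, and environment names to your setup:

```terraform
# Illustrative sub patterns for the StringLike condition (hypothetical names):
"token.actions.githubusercontent.com:sub" = [
  "repo:my-org/my-repo:ref:refs/heads/main",    # pushes to main
  "repo:my-org/my-repo:ref:refs/tags/v*",       # version tag builds
  "repo:my-org/my-repo:environment:production", # jobs bound to the production environment
  "repo:my-org/my-repo:pull_request",           # pull_request-triggered runs
]
```

Keep the list as narrow as possible; every pattern you add widens the set of workflow runs that can assume the role.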
Reusable Workflows
Extract common steps into reusable workflows to avoid duplication across repositories:
```yaml
# .github/workflows/reusable-build-image.yml (in the platform-team/workflows repo)
name: Build and Push Image

on:
  workflow_call:
    inputs:
      image-repo:
        required: true
        type: string
      context:
        required: false
        type: string
        default: '.'
    outputs:
      image-digest:
        value: ${{ jobs.build.outputs.digest }}
      image-tag:
        value: ${{ jobs.build.outputs.tag }}
    secrets:
      aws-role-arn:
        required: true

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      digest: ${{ steps.build.outputs.digest }}
      tag: ${{ steps.meta.outputs.version }}
    steps:
      # ... full build/scan/sign steps
```

Each service calls the reusable workflow:
```yaml
# In each service's .github/workflows/deploy.yml
jobs:
  build:
    uses: platform-team/workflows/.github/workflows/reusable-build-image.yml@main
    with:
      image-repo: 123456789.dkr.ecr.us-east-1.amazonaws.com/my-org/${{ github.event.repository.name }}
    secrets:
      aws-role-arn: ${{ secrets.AWS_ECR_ROLE_ARN }}
```

When the platform team improves the build workflow (faster caching, better scanning, new attestations), all service pipelines inherit the improvement automatically — no per-repo changes.
Security note: reference reusable workflows by SHA in production pipelines, just as you pin third-party actions:

```yaml
uses: platform-team/workflows/.github/workflows/reusable-build-image.yml@a1b2c3d4e5f6
```

Secrets Management in Pipelines
What goes in GitHub Secrets:
- GITOPS_TOKEN — PAT for writing to the GitOps repo
- ARGOCD_TOKEN — Argo CD API token for sync/wait
- CODECOV_TOKEN — coverage reporting
What does NOT go in GitHub Secrets:
- AWS credentials (use OIDC)
- GCP credentials (use Workload Identity Federation)
- Container registry credentials (use OIDC-based registry auth)
- Kubernetes credentials (use OIDC-assumed IAM role → aws eks update-kubeconfig)
The pattern: use federated OIDC for cloud credentials, reserve GitHub Secrets for service tokens that don't support OIDC.
For runtime secrets (database passwords, API keys your application uses), use AWS Secrets Manager or SSM Parameter Store + External Secrets Operator. Never pass them through GitHub Actions environment variables — they'll appear in workflow logs if any step prints the environment.
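A sketch of that pattern, with hypothetical names (`aws-secrets`, `api-db`, `prod/api/db-password`): an External Secrets Operator resource pulls the value from Secrets Manager into the cluster, so it never transits CI at all.

```yaml
# Hypothetical ExternalSecret: syncs one Secrets Manager entry into a
# Kubernetes Secret that the Deployment references. CI never sees the value.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-db
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets        # a ClusterSecretStore configured for AWS Secrets Manager
    kind: ClusterSecretStore
  target:
    name: api-db             # the Kubernetes Secret that gets created
  data:
    - secretKey: DB_PASSWORD # key inside the created Secret
      remoteRef:
        key: prod/api/db-password  # path in AWS Secrets Manager
```

The ExternalSecret manifest itself is safe to commit to the GitOps repo, since it contains only references, never values.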
Caching Strategy
Effective caching dramatically reduces build times:
```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3

- name: Build with layer cache
  uses: docker/build-push-action@v6
  with:
    # gha = GitHub Actions cache (per-branch); registry = cross-branch cache
    cache-from: |
      type=gha
      type=registry,ref=${{ env.IMAGE_REPO }}:cache
    cache-to: type=gha,mode=max  # Write back to the GHA cache
```

GitHub Actions cache (type=gha) is fast and free but isolated per branch. A PR branch gets no benefit from the cache built on main.
Registry cache (type=registry) stores cache manifests in the registry alongside your image. All branches share this cache. More expensive (registry storage costs) but much better hit rate for PRs.
For monorepos with multiple services, use separate cache keys per service:
```yaml
cache-from: type=registry,ref=${{ env.IMAGE_REPO }}/service-a:cache
cache-to: type=registry,ref=${{ env.IMAGE_REPO }}/service-a:cache,mode=max
```

Environment Promotion
For staging → production promotion:
```yaml
# Separate deploy jobs per environment
jobs:
  deploy-staging:
    environment: staging
    steps:
      - name: Update staging values
        run: |
          yq eval '.image.tag = "${{ env.TAG }}"' \
            -i apps/api/values-staging.yaml

  deploy-production:
    environment: production  # Requires manual approval via GitHub environment protection
    needs: [deploy-staging, integration-tests]
    steps:
      - name: Update production values
        run: |
          yq eval '.image.tag = "${{ env.TAG }}"' \
            -i apps/api/values-production.yaml
```

A GitHub environment with required reviewers creates a manual approval gate before the production deploy job runs: the workflow pauses, reviewers are notified, and they approve or reject.
Monitoring the Pipeline Itself
```yaml
- name: Notify Slack on failure
  if: failure()
  uses: slackapi/slack-github-action@v1
  with:
    payload: |
      {
        "text": "❌ Deploy failed for *${{ github.repository }}*",
        "blocks": [
          {
            "type": "section",
            "text": {
              "type": "mrkdwn",
              "text": "*Deploy failed*\n*Repo:* ${{ github.repository }}\n*Branch:* ${{ github.ref_name }}\n*Commit:* ${{ github.sha }}\n*Actor:* ${{ github.actor }}\n*Run:* ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
            }
          }
        ]
      }
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_DEPLOY_WEBHOOK }}
```

Also track pipeline metrics in Grafana — GitHub Actions exposes a REST API for workflow run history. A dashboard showing deploy frequency, success rate, and lead time gives you DORA metrics without additional tooling.
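A sketch of the calculation behind such a dashboard. The API call in the comment uses placeholder repo and workflow names, and the inline sample list stands in for its response:

```shell
# Hypothetical sketch: compute deploy success rate from workflow-run
# conclusions. In a real pipeline these come from the REST API, e.g.:
#   curl -s -H "Authorization: Bearer $GH_TOKEN" \
#     "https://api.github.com/repos/my-org/my-repo/actions/workflows/deploy.yml/runs?per_page=50" \
#     | jq -r '.workflow_runs[].conclusion'
# A sample list stands in for the API response here:
conclusions="success failure success success"

total=0; ok=0
for c in $conclusions; do
  total=$((total + 1))
  if [ "$c" = "success" ]; then ok=$((ok + 1)); fi
done
echo "deploy success rate: $ok/$total"
```

Deploy frequency and lead time fall out of the same payload: count runs per day, and diff each run's `created_at` against its head commit's timestamp.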
Frequently Asked Questions
How do I handle database migrations in this pipeline?
Run migrations as a Kubernetes Job triggered by the pipeline, after the new image is pushed but before the Deployment is updated. The migration Job uses the new image version and runs with a migration-specific service account:
```yaml
- name: Run migrations
  run: |
    kubectl create job migration-${{ github.sha }} \
      --from=cronjob/migration-template \
      --namespace production
    kubectl wait job/migration-${{ github.sha }} \
      --for=condition=complete \
      --timeout=300s \
      --namespace production
```

Make all migrations backward-compatible — the old application version must handle the new schema while the migration runs and before the new Deployment rolls out.
Should I push the image before or after updating the GitOps repo?
Push first, then update GitOps. If you update the GitOps repo first, Argo CD tries to sync and pulls an image tag that doesn't exist yet — causing a brief period of failing pods. Always push image → verify push succeeded → update GitOps.
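One way to enforce that ordering is a guard step between push and GitOps update. This is a sketch reusing names from the deploy workflow above, and it assumes the Docker CLI is available on the runner:

```yaml
- name: Verify image exists in registry
  run: |
    # `docker manifest inspect` exits non-zero if the tag is not in the
    # registry yet, failing the job before the GitOps repo is touched.
    docker manifest inspect \
      "${{ env.IMAGE_REPO }}:${{ needs.build-push-sign.outputs.image-tag }}"
```

Digest-pinned GitOps updates sidestep the problem entirely: the digest only exists once the push has succeeded.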
How do I roll back a failed deployment?
Since the GitOps repo is the source of truth, rollback is a git revert:
```shell
# Revert the image tag update commit
git revert HEAD --no-edit
git push
# Argo CD detects the change and syncs back to the previous image
```

Or trigger the rollback via the Argo CD UI/CLI:

```shell
argocd app rollback api <previous-sync-id>
```

How do I preview environments for pull requests?
Use Argo CD ApplicationSet with a PR generator:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
spec:
  generators:
    - pullRequest:
        github:
          owner: my-org
          repo: my-app
          labels: [preview]  # Only PRs with this label get preview envs
  template:
    spec:
      destination:
        namespace: preview-{{number}}
```

Each labelled PR gets a namespace with the PR's image deployed. The ApplicationSet removes the namespace when the PR is closed. For a deeper look at all ApplicationSet generators and multi-cluster patterns, see Argo CD ApplicationSet: Multi-Cluster Deployment and Generator Patterns.
For GitOps setup that this pipeline writes to, see GitOps with Argo CD: Production Setup Guide. For supply chain security tooling (Sigstore, SLSA, OPA admission, Kyverno image policies), see Supply Chain Security Tools for Kubernetes.
Building a Kubernetes CI/CD pipeline from scratch? Talk to us at Coding Protocols — we help platform teams design pipelines that are secure by default and fast enough that developers actually use them.


