CI Workflow

Reference documentation for the GitHub Actions CI workflow.

Repeated E2E logic is factored into reusable shell scripts (hack/ci-*.sh) and a composite GitHub Action (.github/actions/setup-e2e-infra/), reducing duplication across the e2e-infra, e2e-operator, and tempest jobs.

The build-e2e-images job centralises E2E image builds: it builds all Docker images (operator, service, tempest) once and pushes them to GHCR with run-scoped tags (e2e-${run_id}-<orig_tag>). The e2e-operator, e2e-chaos, and tempest jobs docker pull from GHCR via the load-e2e-images composite action and re-tag the images to their canonical local references, saving ~5-10 min per CI run versus rebuilding. The build always includes keystone (required by tempest) regardless of which operator triggered the pipeline. The cleanup-e2e-tags job prunes the run-scoped tags at the end of the workflow, with a nightly safety net in cleanup-images.yaml for cancelled runs (GH-310).
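
The run-scoped tagging scheme amounts to a pair of string transformations. The sketch below is illustrative only: the helper names `to_run_scoped_ref` and `to_canonical_ref` are hypothetical, the actual push step and load-e2e-images action are not shown in this document, and every input reference is assumed to carry an explicit tag.

```shell
# Hypothetical helpers; the real push step and load-e2e-images action may differ.

# to_run_scoped_ref <image-ref> <run-id>
# Rewrites <repo>:<tag> to <repo>:e2e-<run-id>-<tag>.
to_run_scoped_ref() {
  local ref="$1" run_id="$2"
  local repo="${ref%:*}"   # everything before the last colon
  local tag="${ref##*:}"   # everything after the last colon
  printf '%s:e2e-%s-%s\n' "$repo" "$run_id" "$tag"
}

# to_canonical_ref <run-scoped-ref> <run-id>
# Reverses the mapping so the image can be re-tagged to its local reference.
to_canonical_ref() {
  local ref="$1" run_id="$2"
  local repo="${ref%:*}"
  local tag="${ref##*:}"
  printf '%s:%s\n' "$repo" "${tag#e2e-"$run_id"-}"
}
```

A producer could then run something like `docker tag "$ref" "$(to_run_scoped_ref "$ref" "$GITHUB_RUN_ID")"` before pushing, and a consumer would apply `to_canonical_ref` after pulling.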

File Location

.github/workflows/ci.yaml

The file uses the .yaml extension (matching reuse.yaml and deploy-docs.yaml) and quotes the trigger key as "on" to prevent YAML boolean interpretation.

Trigger Events

The workflow triggers on three event types:

| Event | Scope | Description |
| --- | --- | --- |
| push | branches: [main] | Runs on every push to the main branch |
| push | tags: ["v*"] | Runs on every v-prefixed tag push (triggers publish and release jobs) |
| pull_request | branches: [main], types: [opened, synchronize, reopened, labeled] | Runs on every pull request targeting main; includes labeled type to support on-demand chaos via run-chaos label |

Tag pushes (v*) enable the full release pipeline: gate jobs, E2E tests, image/chart publishing, and GitHub Release creation. Pull requests and main-branch pushes run only gate and E2E jobs (publish jobs are skipped via if conditions).
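
For a tag push, GitHub exposes the ref as `refs/tags/<tag>`, so a v-prefixed release can be detected with a simple pattern match. The helper below is a minimal sketch with a hypothetical name (`is_release_tag`); the workflow's actual `if` expressions are not reproduced here and may be written differently.

```shell
# is_release_tag <git-ref>
# Succeeds only for refs created by pushing a v-prefixed tag.
is_release_tag() {
  case "$1" in
    refs/tags/v*) return 0 ;;
    *) return 1 ;;
  esac
}
```

In a step, `is_release_tag "$GITHUB_REF"` would be true for `refs/tags/v1.2.3` and false for `refs/heads/main` or a pull request ref.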

Environment Variables

Top-level environment variables centralise registry configuration and pin tool versions for CI reproducibility:

```yaml
env:
  REGISTRY: ghcr.io
  IMAGE_PREFIX: ghcr.io/c5c3
  CONTROLLER_GEN_VERSION: v0.20.1
  GOFUMPT_VERSION: v0.9.2
```

REGISTRY and IMAGE_PREFIX are referenced by the build-and-push, helm-push, e2e-operator, and tempest jobs to construct image names and registry URLs. CONTROLLER_GEN_VERSION is used by verify-codegen to pin controller-gen to a specific version. GOFUMPT_VERSION is used by format-check to pin gofumpt to a specific version; the same version is mirrored in the Makefile (GOFUMPT_VERSION ?= v0.9.2) so that make fmt and make format-check use a consistent version locally. setup-envtest is installed via @release-0.23 because the sub-module does not publish its own release tags.

Permissions

Top-level permissions are restricted to least privilege:

```yaml
permissions:
  contents: read
```

Jobs that need elevated access declare per-job permissions: blocks:

| Job | Additional Permissions | Reason |
| --- | --- | --- |
| build-and-push | packages: write | Push per-platform operator image digests to GHCR |
| merge-operator-images | packages: write | Push final multi-arch operator image manifest list |
| helm-push | packages: write | Push Helm charts to GHCR OCI registry |
| github-release | contents: write | Create GitHub Releases |

Job Dependency DAG

The workflow defines 24 jobs organised in a directed acyclic graph:

Gate Jobs (always run):
  lint ────────────────────────┐
  format-check ────────────────┤
  shellcheck ──────────────────┤
  verify-codegen ──────────────┤
  verify-invalid-cr-fixtures ──┤
  chainsaw-lint ───────────────┤
  test (matrix) ───────────────┼──> build-e2e-images ──> E2E Jobs
  test-integration ────────────┘

Conditional Jobs (path-filtered via changes job):
  test-race ────> needs: [changes], if: needs.changes.outputs.go == 'true'
  govulncheck ─> needs: [changes], if: needs.changes.outputs.go == 'true'
  helm-validate ──> needs: [changes], if: needs.changes.outputs.helm == 'true'
  docs ──────────> needs: [changes], if: needs.changes.outputs.docs == 'true'

Image Build (depends on gates):
  build-e2e-images ──> needs: [changes, lint, shellcheck, test, test-integration, verify-codegen, verify-invalid-cr-fixtures, chainsaw-lint]

E2E Jobs (depends on build-e2e-images):
  e2e-infra ──────> needs: [changes], if: needs.changes.outputs.e2e-infra == 'true'
  e2e-operator ───> needs: [changes, build-e2e-images]
  e2e-chaos ──────> needs: [changes, lint, shellcheck, test, test-integration, verify-codegen, chainsaw-lint, build-e2e-images]
  e2e-prometheus ─> needs: [changes, lint, shellcheck, test, test-integration, verify-codegen, chainsaw-lint, build-e2e-images]
                     if: needs.changes.outputs.e2e-prometheus == 'true'
  tempest ────────> needs: [changes, build-e2e-images]

Publish Jobs (main/tags only, depends on E2E):
  build-and-push (matrix: operator × platform) ──> needs: [changes, e2e-operator], if: push event
    └──> merge-operator-images ──> needs: [changes, build-and-push], if: push event
  helm-push ──> needs: [changes, e2e-operator], if: push event

Release Job (v* tags only, depends on publish):
  github-release ──> needs: [changes, merge-operator-images, helm-push], if: v* tag

The five E2E jobs (e2e-infra, e2e-operator, e2e-chaos, e2e-prometheus, tempest) share infrastructure setup via the setup-e2e-infra composite action and diagnostic teardown via hack/ci-dump-diagnostics.sh.

Jobs

lint

Runs golangci-lint using the project's .golangci.yml configuration.

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | actions/setup-go@v6 | Sets up Go with go-version-file: go.work |
| 3 | golangci/golangci-lint-action@v9 | Installs golangci-lint binary (install-only: true); version pinned to v2.11.4 |
| 4 | make lint | Runs golangci-lint per module via the Makefile |

The golangci-lint-action@v9 step is used with install-only: true, which installs the pinned golangci-lint binary (and caches it) without running lint. The actual linting is delegated to make lint, which cds into each module directory and runs golangci-lint run ./... — a necessary pattern for Go multi-module workspaces. The actions/setup-go@v6 step is required because install-only mode does not set up Go internally.

Enabled linters (12 total, configured in .golangci.yml):

| Linter | Category | Description |
| --- | --- | --- |
| errcheck | correctness | Checks for unchecked errors in Go code |
| gocritic | style | Provides diagnostics for bugs, performance, and style issues |
| govet | correctness | Reports suspicious constructs, roughly equivalent to go vet |
| ineffassign | correctness | Detects assignments to existing variables that are never used |
| staticcheck | correctness | Comprehensive static analysis rules from the staticcheck suite |
| unused | correctness | Checks for unused constants, variables, functions, and types |
| bodyclose | resource-leak | Checks whether HTTP response bodies are closed successfully |
| errorlint | correctness | Validates Go 1.13+ error wrapping patterns (%w, errors.Is, errors.As) |
| exhaustive | correctness | Checks exhaustiveness of enum switch statements |
| gosec | security | Inspects source code for security problems (hardcoded credentials, weak crypto, unsafe operations) |
| nilerr | correctness | Finds code that returns nil even after checking that an error is not nil |
| noctx | correctness | Detects HTTP requests and TLS dials missing context.Context propagation |

Generated code matching zz_generated.*.go is excluded from all lint checks via the exclusions.paths configuration.

format-check

Verifies all Go files conform to gofumpt formatting. gofumpt is a stricter superset of gofmt: it applies all standard gofmt rules plus additional formatting conventions for consistency. The job detects non-conforming files and prints a unified diff showing the required changes, so developers can identify and fix formatting issues without guessing.

Only git-tracked Go files are checked (git ls-files '*.go') to avoid unexpected failures on generated, vendored, or tooling code that may not follow gofumpt conventions.

The same version and check logic are available locally via the Makefile: make install-gofumpt installs the pinned version, make format-check mirrors the CI check, and make fmt applies formatting to all tracked Go files. The Makefile targets use xargs without the -r flag (unlike CI) for macOS portability — BSD xargs does not support -r. This is safe because the repository always contains tracked .go files.

Dependencies: needs: [changes]
Condition: if: needs.changes.outputs.go == 'true'

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | actions/setup-go@v6 | Sets up Go with go-version-file: go.work |
| 3 | go install mvdan.cc/gofumpt@${{ env.GOFUMPT_VERSION }} | Installs gofumpt at the pinned version (v0.9.2) |
| 4 | git ls-files '*.go' \| xargs -r gofumpt -l | Lists non-conforming tracked Go files; on failure, prints unified diff and exits 1 |

The check uses git ls-files '*.go' | xargs -r gofumpt -l to collect non-conforming files from tracked sources only. If any are found, their paths are printed along with a unified diff (gofumpt -d), and the job exits 1. The -r flag prevents xargs from running gofumpt when no Go files are piped (-r is supported by GNU xargs from findutils, which ubuntu-latest provides).
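
The control flow of the check can be sketched as a small function. This is an approximation rather than the workflow's actual step: the name `format_check` is hypothetical, the formatter is passed as a parameter so it can be stubbed, the file list is read from stdin instead of going through xargs, and paths are assumed to contain no whitespace.

```shell
# format_check [formatter]
# Reads tracked Go file paths (one per line) from stdin.
# Returns 1 and prints the offending files plus a diff if any need formatting.
format_check() {
  local tool="${1:-gofumpt}" files unformatted
  files="$(cat)"
  # Same effect as `xargs -r`: do nothing when no files were piped in.
  [ -n "$files" ] || return 0
  # Assumes paths contain no whitespace, so word splitting is safe here.
  unformatted="$($tool -l $files)"
  if [ -n "$unformatted" ]; then
    echo "The following files are not gofumpt-formatted:"
    echo "$unformatted"
    $tool -d $unformatted   # unified diff of the required changes
    return 1
  fi
}
```

Typical use would be `git ls-files '*.go' | format_check`, with the function's exit status deciding whether the job fails.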

Timeout: 5 minutes.

shellcheck

Validates shell scripts with shellcheck to catch scripting issues early. The shellcheck binary is pre-installed on ubuntu-latest runners.

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | shellcheck --severity=warning hack/*.sh | Lints all shell scripts in hack/ |

Timeout: 5 minutes.

verify-invalid-cr-fixtures

Enforces the canonical-scaffold contract for the invalid-CR Chainsaw fixtures. Runs _generate.py --check (drift mode) and the test_generate.py unit suite (FIXTURES count + chainsaw-test.yaml cross-reference) so a hand-edit to any 02-…/03-…/…/12-*.yaml fixture, or a rename or removal that desynchronises FIXTURES from chainsaw-test.yaml, fails the build before the heavy cluster-bound e2e-operator job runs. Always-on because the check is sub-second and python3 is preinstalled on ubuntu-latest runners.

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | make verify-invalid-cr-fixtures | Runs _generate.py --check and test_generate.py |

Timeout: 5 minutes.

chainsaw-lint

Schema-lints every Chainsaw test (tests/**/chainsaw-test.yaml) and configuration (tests/{e2e,e2e-chaos}/chainsaw-config.yaml) via chainsaw lint so typos, removed fields, or schema drift after a chainsaw version bump fail fast — before the cluster-bound e2e-operator and e2e-chaos jobs spin up a kind cluster. Always-on because no cluster is needed: chainsaw is restored from the shared testdeps cache via the setup-test-deps composite action, the same one consumed internally by setup-e2e-infra. A schema break therefore surfaces in needs.*.result for both build-e2e-images and e2e-chaos.

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | ./.github/actions/setup-test-deps | Restores the testdeps cache and runs make install-test-deps (puts chainsaw on PATH) |
| 3 | make chainsaw-lint | Runs chainsaw lint test -f and chainsaw lint configuration -f over every matching file under tests/ |

Timeout: 5 minutes.

shell-unit-tests-cc-0100

Sub-second shell unit-test suite that pins down the WITH_PROMETHEUS gating contract. The suite is its own job (rather than steps folded into another job) because the tests are scoped to the kind-only opt-in plumbing — keeping them isolated keeps their failure signal unambiguous. The job runs unconditionally on every PR, so a regression in the gating contract is caught even when no prometheus paths have changed.

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | bash tests/unit/hack/deploy_infra_prometheus_flag_test.sh | Asserts hack/deploy-infra.sh stages keystone-operator.json, applies deploy/kind/prometheus/, and invokes enable_keystone_operator_servicemonitor only when WITH_PROMETHEUS=true |
| 3 | bash tests/unit/ci/ci_path_filters_e2e_prometheus_test.sh | Asserts hack/ci-resolve-changes.sh emits e2e-prometheus=true for the documented trigger inputs |

Timeout: 5 minutes.

A future cleanup will fold these tests into the existing make test-shell target once the unrelated docs-test bug is resolved.

test

Runs unit tests with a matrix strategy over [common, keystone, c5c3]. Each matrix leg tests a single target — either internal/common or one operator — producing a single coverage profile uploaded to Codecov under a dedicated flag.

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | actions/setup-go@v6 | Sets up Go with go-version-file: go.work |
| 3 | make test-common or make test-operator | Runs unit tests for the matrix target |
| 4 | codecov/codecov-action@v5 | Uploads coverage profile with target-specific flag |

Matrix strategy:

```yaml
strategy:
  fail-fast: false
  matrix:
    target: [common, keystone, c5c3]
```

The common leg runs make test-common (producing cover-unit-common.out). Operator legs run make test-operator OPERATOR=<target> (producing cover-unit-<operator>.out). This deduplicates common coverage into a single leg instead of uploading it under each operator flag.

Coverage upload:

```yaml
files: cover-unit-${{ matrix.target }}.out
flags: unit-${{ matrix.target }}
```

The if: always() condition ensures coverage is uploaded even when tests fail, so partial coverage data is not lost.
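
The mapping from matrix target to make invocation and coverage file can be summarised in two small helpers. The function names are hypothetical and the sketch only restates the naming rules described above; how the workflow itself selects the command is not shown here.

```shell
# unit_test_command <matrix-target>
# Prints the make invocation for a matrix leg.
unit_test_command() {
  case "$1" in
    common) echo "make test-common" ;;
    *)      echo "make test-operator OPERATOR=$1" ;;
  esac
}

# coverage_profile <matrix-target>
# Prints the coverage file that the leg is expected to produce.
coverage_profile() {
  echo "cover-unit-$1.out"
}
```

For the keystone leg this yields `make test-operator OPERATOR=keystone` and `cover-unit-keystone.out`, which lines up with the `files:` and `flags:` values passed to the Codecov upload.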

test-integration

Runs envtest-based integration tests with a matrix strategy over [common, keystone, c5c3] and coverage uploaded to Codecov. Requires setup-envtest to download kubebuilder assets (kube-apiserver, etcd) for the test API server.

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | actions/setup-go@v6 | Sets up Go with go-version-file: go.work |
| 3 | go install setup-envtest@release-0.23 | Installs envtest asset downloader (pinned to release branch) |
| 4 | make test-integration-common or make test-integration | Runs integration tests for the matrix target |
| 5 | codecov/codecov-action@v5 | Uploads coverage with integration-<target> flag |

Matrix strategy:

```yaml
strategy:
  fail-fast: false
  matrix:
    target: [common, keystone, c5c3]
```

The common leg runs make test-integration-common (producing cover-integration-common.out), which tests ./internal/common/... with -tags=integration. Operator legs run make test-integration OPERATOR=<target> (producing cover-integration-<operator>.out). Both targets set KUBEBUILDER_ASSETS via $(SETUP_ENVTEST) use <pinned-k8s-version> -p path.

Timeout: 15 minutes (longer than unit tests to account for envtest startup).

test-race

Runs all Go unit tests with the race detector enabled to catch data races in concurrent operator code — reconcilers, watches, informer caches. Separate from the main test job because the race detector adds 2–5x overhead. Uses -count=1 to disable test caching, since race conditions are non-deterministic and cached results could mask real races.

Dependencies: needs: [changes]
Condition: if: needs.changes.outputs.go == 'true'
Path filter: Go source files (same filter as test and test-integration)

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | actions/setup-go@v6 | Sets up Go with go-version-file: go.work |
| 3 | make test-race RACE_FLAGS="-count=1" | Delegates to the Makefile so the module list stays in sync |

CI delegates to make test-race so the list of modules under race testing is defined in one place (the Makefile's OPERATORS variable and internal/common). RACE_FLAGS="-count=1" disables test caching — race conditions are non-deterministic, so cached results could mask real races. No continue-on-error or if: always() — a detected data race fails the job immediately.

This job runs independently and does not appear in any other job's needs: array. It is not on the critical path for E2E or publish jobs, so race detector overhead does not slow down the primary feedback loop. The corresponding local command is make test-race (which omits -count=1 via the default empty RACE_FLAGS for developer convenience).

Timeout: 20 minutes (accommodates 2–5x race detector overhead).

govulncheck

Scans all Go modules for reachable vulnerabilities using govulncheck, the official Go vulnerability scanner maintained by the Go team. Unlike dependency-list scanners, govulncheck analyses call graphs to detect only vulnerabilities in code paths that are actually reachable — reducing false positives. Catches supply-chain vulnerabilities at the PR stage, before container images are built.

Dependencies: needs: [changes]
Condition: if: needs.changes.outputs.go == 'true'
Path filter: Go source files (same filter as test, test-integration, and test-race)

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | actions/setup-go@v6 | Sets up Go with go-version-file: go.work |
| 3 | go install golang.org/x/vuln/cmd/govulncheck@latest | Installs the latest govulncheck binary |
| 4 | make govulncheck | Delegates to the Makefile target, which iterates over internal/common and all $(OPERATORS) modules |

govulncheck uses @latest intentionally — unlike other pinned tools (controller-gen, gofumpt), pinning govulncheck to an old version defeats the purpose of vulnerability scanning because the vulnerability database is updated frequently. This is a deliberate deviation from the general pinning policy, justified by the security tool's nature.

The CI step delegates to make govulncheck, which iterates over internal/common and each operator in the $(OPERATORS) Makefile variable. The Makefile target exits on the first module with a reachable vulnerability. govulncheck exits non-zero only for reachable vulnerabilities — dependencies with known CVEs whose vulnerable functions are not called in project code are reported as informational but do not fail the job.
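
The fail-on-first-module behaviour can be pictured as a loop that stops as soon as one scan returns non-zero. The sketch below is not the Makefile target itself: `scan_modules` is a hypothetical name, and the scan command is passed in as a parameter so the loop can be exercised without govulncheck installed.

```shell
# scan_modules <scan-command> <module-dir>...
# Runs the scan command inside each module directory and stops at the
# first module whose scan exits non-zero.
scan_modules() {
  local scan_cmd="$1" dir
  shift
  for dir in "$@"; do
    echo "==> scanning $dir"
    # The subshell keeps the cd from leaking into the next iteration.
    ( cd "$dir" && $scan_cmd ) || return 1
  done
}
```

An invocation along the lines of `scan_modules "govulncheck ./..." internal/common operators/keystone` would scan the shared module first and skip the remaining modules once a reachable vulnerability is reported.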

This job runs independently and does not appear in any other job's needs: array. It is not on the critical path for E2E or publish jobs, matching the test-race pattern. When a new Go module is added to go.work, the OPERATORS variable in the Makefile must be updated with the new module name. The verification test (tests/ci/verify_govulncheck_modules.sh) catches drift between go.work and the Makefile automatically.

Timeout: 10 minutes.

verify-codegen

Verifies that generated code (CRD manifests, deepcopy functions) is committed and up-to-date. This is a gate job — it blocks merge alongside lint, test, and shellcheck.

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | actions/setup-go@v6 | Sets up Go with go-version-file: go.work |
| 3 | go install controller-gen@${{ env.CONTROLLER_GEN_VERSION }} | Installs the pinned code generator |
| 4 | make manifests && make generate | Regenerates CRD manifests and deepcopy functions |
| 5 | make verify-crd-sync | Verifies Helm chart CRD copies match controller-gen output |
| 6 | git diff --exit-code | Fails if any files changed (stale generated code) |

When the diff check fails, the job produces a GitHub Actions ::error:: annotation with instructions to run make manifests && make generate locally and commit the result.
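
A minimal sketch of such a diff check follows. The function name `check_generated_clean` and the exact wording of the message are illustrative assumptions; only the `::error::` workflow-command prefix and the `git diff --exit-code` call come from the description above.

```shell
# check_generated_clean [diff-command]
# Emits a GitHub Actions error annotation and fails when the working tree
# differs from what is committed.
check_generated_clean() {
  local diff_cmd="${1:-git diff --exit-code}"
  if ! $diff_cmd >/dev/null; then
    echo "::error::Generated code is out of date. Run 'make manifests && make generate' locally and commit the result."
    return 1
  fi
}
```

Running it after the regeneration steps means any file touched by controller-gen surfaces as an annotation on the workflow run instead of a bare non-zero exit.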

docs

Builds the VitePress documentation site to catch broken links and build errors.

Dependencies: needs: [changes]
Condition: if: needs.changes.outputs.docs == 'true'
Path filter: docs/**, package.json, package-lock.json

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Full history (fetch-depth: 0) for git-based features |
| 2 | actions/setup-node@v6 | Node.js 24, npm cache enabled |
| 3 | npm ci | Installs dependencies from lockfile |
| 4 | npm run docs:build | Builds the documentation site |

helm-validate

Validates Helm chart structure, template rendering, and unit tests without requiring a cluster. Runs helm lint, helm template with five value override scenarios, and helm unittest to catch chart regressions at PR time.

Dependencies: needs: [changes]
Condition: if: needs.changes.outputs.helm == 'true'
Path filter: operators/keystone/helm/** (forced true on v* tag pushes)

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | azure/setup-helm@v5 | Installs Helm CLI (SHA-pinned) |
| 3 | helm plugin install helm-unittest | Installs helm-unittest plugin (pinned to v1.0.3) |
| 4 | helm lint | Validates chart structure and syntax for operators/keystone/helm/keystone-operator/ |
| 5 | helm template (5 scenarios) | Renders chart with value overrides to catch broken conditionals and invalid YAML |
| 6 | helm unittest | Runs unit test suites from operators/keystone/helm/keystone-operator/tests/ |

Template scenarios (step 5):

| Scenario | Values | Purpose |
| --- | --- | --- |
| 1 - default values | (none) | Validates baseline rendering with chart defaults |
| 2 - webhook disabled | webhook.enabled=false | Validates conditional exclusion of webhook resources |
| 3 - external service account | serviceAccount.create=false, serviceAccount.name=existing-sa | Validates ServiceAccount conditional logic |
| 4 - custom resources | resources.limits.cpu=100m, resources.limits.memory=64Mi | Validates resource override wiring |
| 5 - namespace-scoped RBAC | rbac.namespaceScoped=true, webhook.enabled=false | Validates Role/RoleBinding rendering instead of ClusterRole/ClusterRoleBinding |
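
The five scenarios translate into five `helm template` invocations that differ only in their `--set` overrides. The sketch below prints those commands rather than running them; the function name and the release name `keystone-operator` are assumptions for illustration, and the workflow's real step may pass additional flags.

```shell
CHART_DIR="operators/keystone/helm/keystone-operator"

# print_template_scenarios
# Emits one `helm template` command per scenario, in table order.
# The release name "keystone-operator" is an assumption.
print_template_scenarios() {
  local overrides
  for overrides in \
    "" \
    "--set webhook.enabled=false" \
    "--set serviceAccount.create=false --set serviceAccount.name=existing-sa" \
    "--set resources.limits.cpu=100m --set resources.limits.memory=64Mi" \
    "--set rbac.namespaceScoped=true --set webhook.enabled=false"
  do
    # Append the overrides only when the scenario has any.
    echo "helm template keystone-operator $CHART_DIR${overrides:+ $overrides}"
  done
}
```

Piping the output to `sh -e` would execute the renders in order and stop at the first scenario that fails to template.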

Unit test suites (step 6):

| Test File | Template Under Test | Key Assertions |
| --- | --- | --- |
| deployment_test.yaml | deployment.yaml | Image, replicas, resources, securityContext, probes, args, conditional webhook volume mount |
| clusterrole_test.yaml | clusterrole.yaml | All 14 RBAC rule blocks with correct verbs |
| clusterrolebinding_test.yaml | clusterrolebinding.yaml | roleRef and ServiceAccount subject binding |
| service_test.yaml | service.yaml | Metrics port (8080), conditional webhook port (443→9443) |
| serviceaccount_test.yaml | serviceaccount.yaml | Conditional creation (create=true/false), custom name override, standard labels |
| webhook_test.yaml | webhook-configuration.yaml | Mutating/Validating configs when enabled, absent when disabled, cert-manager annotation |
| certificate_test.yaml | certificate.yaml | Issuer and Certificate when enabled, absent when disabled, DNS names, issuer reference |

Timeout: 10 minutes.

e2e-infra

End-to-end infrastructure deployment and Chainsaw test. Deploys the full infrastructure stack (Flux, cert-manager, MariaDB, ESO, OpenBao) to a kind cluster and validates health of all operators, CRs, and ExternalSecrets.

Dependencies: needs: [changes]
Condition: if: needs.changes.outputs.e2e-infra == 'true'

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | actions/setup-go@v6 | Sets up Go with go-version-file: go.work |
| 3 | helm/kind-action@v1.14.0 | Creates kind cluster (forge-e2e) |
| 4 | setup-e2e-infra composite action | Installs Flux CLI, test deps, and deploys infra stack |
| 5 | chainsaw test | Runs E2E tests from tests/e2e/infrastructure/ |
| 6 | hack/ci-dump-diagnostics.sh (on failure) | Dumps HelmReleases, pods, events, Flux logs |
| 7 | Upload JUnit report | Uploads test results as artifact (14-day retention) |

Timeout: 20 minutes.

build-e2e-images

Centralised image build for E2E test jobs. Builds all Docker images (base, operator, service, tempest) once and pushes them to GHCR under run-scoped tags (e2e-${run_id}-<orig_tag>). The e2e-operator, e2e-chaos, and tempest jobs docker pull from GHCR via the load-e2e-images composite action instead of rebuilding, saving ~5-10 min per CI run.

Dependencies: needs: [changes, lint, shellcheck, test, test-integration, verify-codegen, verify-invalid-cr-fixtures, chainsaw-lint]

Condition: Runs only when has-e2e-operators == 'true' (or e2e-chaos == 'true') and no gate job failed. Uses always() so the job runs when upstream Go jobs are skipped (e.g. pure E2E test-definition PRs where go=false). Skipped on PRs from forks (the workflow's GITHUB_TOKEN is read-only on packages: for forked pull_request events, so GHCR push would fail) — see github.event.pull_request.head.repo.fork guard.
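
The condition combines three checks: the fork guard, the change-detection outputs, and the gate results. The function below is a hedged sketch of that decision, not the workflow's actual `if` expression. Its name and argument order are hypothetical, and it only treats `failure` as blocking because the text above does not say how a cancelled gate is handled.

```shell
# should_build_e2e_images <has-e2e-operators> <e2e-chaos> <is-fork> <gate-result>...
# Gate results use the values GitHub reports in needs.<job>.result:
# success, failure, cancelled, skipped.
should_build_e2e_images() {
  local has_ops="$1" chaos="$2" is_fork="$3" result
  shift 3
  # Fork PRs cannot push to GHCR, so the build is skipped outright.
  [ "$is_fork" = "true" ] && return 1
  # At least one consumer of the images must be scheduled.
  [ "$has_ops" = "true" ] || [ "$chaos" = "true" ] || return 1
  for result in "$@"; do
    case "$result" in
      failure) return 1 ;;   # a failed gate blocks the build; skipped is tolerated
    esac
  done
  return 0
}
```

With `should_build_e2e_images true false false success skipped` the build proceeds, which corresponds to a PR that changes only E2E test definitions and leaves the Go gates skipped.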

Permissions: contents: read, packages: write (required for GHCR push).

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | docker/setup-buildx-action@v4 | Sets up BuildKit for type=gha cache support |
| 3 | docker/login-action@v4 | Authenticates to GHCR with GITHUB_TOKEN |
| 4 | Resolve build operators | Unions e2e-operators with a fixed keystone entry (required by tempest) |
| 5 | Build base images | Builds python-base and venv-builder (reused by subsequent builds) |
| 6 | Build operator images | Builds <IMAGE_PREFIX>/<op>-operator:dev for each resolved operator |
| 7 | Build service images | Builds <IMAGE_PREFIX>/<op>:<release> for each operator x release combination |
| 8 | Build Tempest images | Builds <IMAGE_PREFIX>/tempest:<release> for all releases |
| 9 | Push E2E images to GHCR | For each image, docker tag to <repo>:e2e-${run_id}-<orig_tag> and docker push |

The "Resolve build operators" step guarantees that keystone is always in the build set. This is required because the tempest job hardcodes keystone-operator:dev and keystone:<release> — without the union, a pipeline triggered by a different operator (e.g. glance) would fail tempest due to missing keystone images.
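
The union can be expressed in one line of shell. This is a simplified sketch: `resolve_build_operators` is a hypothetical name, and it takes plain operator names as arguments, whereas the real step presumably works from the JSON matrix emitted by the changes job.

```shell
# resolve_build_operators <operator>...
# Prints the given operators plus keystone, one per line, without duplicates.
resolve_build_operators() {
  printf '%s\n' "$@" keystone | sort -u
}
```

A pipeline triggered only by glance would resolve to `glance` and `keystone`, while one triggered by keystone alone still resolves to a single `keystone` entry.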

GH-310 replaced the previous docker save | zstd | upload-artifact transport with GHCR push/pull because the 355 MB single-blob artifact intermittently timed out at the 5-minute actions/download-artifact window. Layer-level pull retries plus the GHCR CDN dramatically reduce the failure rate.

Timeout: 30 minutes.

e2e-operator

End-to-end operator test using kind cluster and Chainsaw. Pulls pre-built images from GHCR via the load-e2e-images composite action, deploys the infrastructure stack and operator via Helm, and runs Chainsaw E2E test suites.

Dependencies: needs: [changes, build-e2e-images]
Condition: Runs only when has-e2e-operators == 'true' and build-e2e-images succeeded.
Permissions: contents: read, packages: read (required for GHCR pull).

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | actions/setup-go@v6 | Sets up Go with go-version-file: go.work |
| 3 | helm/kind-action@v1.14.0 | Creates kind cluster (forge-e2e) |
| 4 | load-e2e-images composite action | Pulls run-scoped GHCR tags and re-tags to canonical local refs |
| 5 | kind load docker-image | Loads operator, 2025.2 service, 2025.2-upgraded, and 2026.1 service images into kind |
| 6 | setup-e2e-infra composite action | Installs Flux CLI, test deps, and deploys infra stack |
| 7 | hack/ci-deploy-operator.sh | Installs CRDs and deploys operator via Helm |
| 8 | chainsaw test | Runs E2E tests from tests/e2e/<operator>/ |
| 9 | hack/ci-dump-diagnostics.sh (always) | Dumps operator pods, all pods, events, operator logs |
| 10 | Upload JUnit report | Uploads test results as artifact (14-day retention) |

Matrix strategy:

```yaml
strategy:
  fail-fast: false
  matrix: ${{ fromJson(needs.changes.outputs.e2e-operators) }}
```

The operator matrix is dynamically constructed by the changes job, including only operators whose code (or shared code) changed. The imagePullPolicy: Never Helm value ensures the kind-loaded image is used instead of attempting a registry pull.

Timeout: 45 minutes.

e2e-chaos

End-to-end chaos tests using kind cluster, Chaos Mesh, and Chainsaw. Pulls the keystone operator and service images from GHCR via the load-e2e-images composite action, deploys them alongside Chaos Mesh infrastructure, and runs the chaos test suites (MariaDB pod kill, Memcached pod kill, OpenBao pod kill, MariaDB network partition, MariaDB network latency). See Chaos E2E Test Suites for test suite details.

Dependencies: needs: [changes, lint, shellcheck, test, test-integration, verify-codegen, chainsaw-lint, build-e2e-images]
Condition: Runs only when e2e-chaos == 'true' or the PR has a run-chaos label, build-e2e-images succeeded, and no dependency failed or was cancelled.
Permissions: contents: read, packages: read (required for GHCR pull).

The e2e-chaos job depends on the standard gate jobs. The e2e-operator dependency was removed so chaos tests run in parallel with operator E2E tests, reducing overall CI wall time. The job uses continue-on-error: true while chaos test stability is being proven in CI — failures are visible but do not block merges or the publish pipeline. This will be revisited after 2–4 weeks of successful CI runs.

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | helm/kind-action@v1.14.0 | Creates kind cluster (forge-e2e) |
| 3 | load-e2e-images composite action | Pulls run-scoped GHCR tags and re-tags to canonical local refs |
| 4 | kind load docker-image | Loads keystone operator and 2025.2 service images into kind |
| 5 | setup-e2e-infra composite action | Installs Flux CLI, test deps, and deploys infra stack with WITH_CHAOS_MESH=true |
| 6 | hack/ci-deploy-operator.sh | Installs CRDs and deploys keystone operator via Helm |
| 7 | chainsaw test | Runs chaos E2E tests from tests/e2e-chaos/ with tests/e2e-chaos/chainsaw-config.yaml |
| 8 | hack/ci-dump-diagnostics.sh (always) | Dumps operator pods, all pods, events, operator logs with OPERATOR=keystone |
| 9 | Upload JUnit report | Uploads _output/reports/ as e2e-chaos-junit-report-<suite> artifact (14-day retention) |

Key differences from e2e-operator:

| Aspect | e2e-operator | e2e-chaos |
| --- | --- | --- |
| Matrix | Dynamic per-operator | Single job (keystone only) |
| Test config | tests/e2e/chainsaw-config.yaml | tests/e2e-chaos/chainsaw-config.yaml |
| Test directory | tests/e2e/<operator>/ | tests/e2e-chaos/ |
| Timeout | 45 minutes | 45 minutes |
| Blocking | Yes | No (continue-on-error: true) |
| Dependencies | Gate jobs | Gate jobs |
| Service images | 2025.2 + 2025.2-upgraded + 2026.1 | 2025.2 only |

The chaos test Chainsaw config uses parallel: 1 (serial execution) because chaos tests mutate shared infrastructure pod availability. The assert timeout is 300s (vs 120s for happy-path tests) to allow multiple reconciliation cycles and pod restart time during fault recovery.

Path filter: tests/e2e-chaos/**, hack/**, deploy/**, .github/workflows/ci.yaml, .github/actions/** (separate from e2e_infra to allow independent gating). Additionally, any Go code change — operator-specific (e.g., operators/keystone/**/*.go) or shared (internal/common/**/*.go via go_common) — triggers the job via go_changed in ci-resolve-changes.sh, since chaos tests validate operator resilience against the current codebase.
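
The path filter can be approximated with a shell `case` statement, where `*` also matches across directory separators. This is a sketch only: `triggers_e2e_chaos` is a hypothetical helper, and the real logic lives in ci-resolve-changes.sh, which is not reproduced here and may classify paths differently.

```shell
# triggers_e2e_chaos <changed-path>
# Succeeds when a changed file should schedule the e2e-chaos job.
triggers_e2e_chaos() {
  case "$1" in
    tests/e2e-chaos/*|hack/*|deploy/*|.github/actions/*) return 0 ;;
    .github/workflows/ci.yaml) return 0 ;;
    operators/*.go|internal/common/*.go) return 0 ;;   # any Go code change
    *) return 1 ;;
  esac
}
```

Looping over the output of `git diff --name-only` and stopping at the first match would be enough to set an `e2e-chaos=true` style output.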

e2e-prometheus

End-to-end kube-prometheus-stack tests using kind cluster, Flux-managed kube-prometheus-stack HelmRelease, and Chainsaw. Loads the prebuilt keystone operator image via the load-e2e-images composite action, deploys it alongside the monitoring stack, and runs the prometheus suite under tests/e2e/keystone/prometheus-stack/ to verify HelmRelease readiness, ServiceMonitor presence, and live Prometheus scraping of the operator metrics endpoint.

Dependencies: needs: [changes, lint, shellcheck, test, test-integration, verify-codegen, chainsaw-lint, build-e2e-images]
Condition: Runs only when e2e-prometheus == 'true', the upstream build-e2e-images job succeeded, and no dependency failed or was cancelled.

The setup-e2e-infra composite action is invoked with WITH_PROMETHEUS: "true" in its step env, which threads through to hack/deploy-infra.sh and gates the kube-prometheus-stack overlay (deploy/kind/prometheus/) plus the post-deploy enable_keystone_operator_servicemonitor patch. The Deploy operator step runs hack/ci-deploy-operator.sh with WITH_PROMETHEUS: "true" in its step env, which adds --set monitoring.serviceMonitor.enabled=true to the Helm install command — without this flag the chart's gated ServiceMonitor template renders nothing and the chainsaw step servicemonitor-exists (and the dependent prometheus-target-up) cannot pass. The kind base kustomization keeps the keystone-operator HelmRelease suspended, so the runtime kubectl patch cannot reactively enable the ServiceMonitor — the install-time flag is the single source of truth.
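
The install-time gating reduces to a conditional flag. The helper below is an illustrative sketch with a hypothetical name (`operator_helm_flags`); the actual structure of hack/ci-deploy-operator.sh is not shown in this document.

```shell
# operator_helm_flags
# Prints the extra `helm install` flags implied by WITH_PROMETHEUS.
operator_helm_flags() {
  if [ "${WITH_PROMETHEUS:-false}" = "true" ]; then
    echo "--set monitoring.serviceMonitor.enabled=true"
  fi
}
```

With `WITH_PROMETHEUS` unset or set to anything other than `true`, the function prints nothing, so the chart's gated ServiceMonitor template stays disabled.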

Unlike e2e-chaos, e2e-prometheus runs with continue-on-error: false: the kube-prometheus stack is deterministic on kind, so any failure is a genuine regression of the kind-only Quick Start observability story.

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | helm/kind-action@v1.14.0 | Creates kind cluster (forge-e2e) |
| 3 | load-e2e-images composite action | Pulls the prebuilt operator and service images pushed by build-e2e-images |
| 4 | kind load docker-image | Loads operator and service images into kind |
| 5 | setup-e2e-infra composite action | Installs Flux CLI, test deps, and deploys infra stack with WITH_PROMETHEUS: "true" |
| 6 | hack/ci-deploy-operator.sh | Installs CRDs and deploys keystone operator via Helm with WITH_PROMETHEUS: "true" (gates --set monitoring.serviceMonitor.enabled=true) |
| 7 | chainsaw test | Runs the prometheus E2E suite from tests/e2e/keystone/prometheus-stack/ |
| 8 | hack/ci-dump-diagnostics.sh (always) | Dumps operator pods, all pods, events, operator logs with OPERATOR=keystone |
| 9 | Upload JUnit report | Uploads _output/reports/ as e2e-prometheus-junit-report artifact (14-day retention) |

Path filter: deploy/kind/prometheus/**, tests/e2e/keystone/prometheus-stack/**, hack/**, deploy/**, .github/workflows/ci.yaml, .github/actions/**. As with e2e-chaos, any Go code change (go_changed) or any E2E test change (any_e2e_tests) also triggers the job via ci-resolve-changes.sh, since the prometheus suite scrapes live operator metrics.

tempest

Tempest API integration tests. Deploys services into a kind cluster and runs the OpenStack Tempest test suite against them. Uses a release matrix to validate each OpenStack release independently, with per-release Tempest configuration, Keystone CRs, and K8s service names. Pulls pre-built images from GHCR (run-scoped tag) via the load-e2e-images composite action.

Dependencies: needs: [changes, build-e2e-images]
Condition: Runs only when has-e2e-operators == 'true' and build-e2e-images succeeded.
Permissions: contents: read, packages: read (required for GHCR pull).

Matrix strategy:

yaml
strategy:
  fail-fast: false
  matrix:
    include:
      - release: "2025.2"
        config-dir: tests/tempest/keystone
        cr-name: keystone-tempest
        service-k8s-name: keystone-tempest-api
      - release: "2026.1"
        config-dir: tests/tempest/keystone-2026-1
        cr-name: keystone-tempest-2026-1
        service-k8s-name: keystone-tempest-2026-1-api

Each matrix entry specifies: the release version, the Tempest configuration directory, the Keystone CR name, and the K8s service name used for port-forwarding. Steps reference these via matrix.release, matrix.config-dir, matrix.cr-name, and matrix.service-k8s-name.

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | actions/setup-go@v6 | Sets up Go with go-version-file: go.work |
| 3 | helm/kind-action@v1.14.0 | Creates kind cluster (forge-e2e) |
| 4 | load-e2e-images composite action | Pulls run-scoped GHCR tags and re-tags to canonical local refs |
| 5 | kind load docker-image | Loads keystone operator and service images into kind |
| 6 | setup-e2e-infra composite action | Installs Flux CLI, test deps, and deploys infra stack |
| 7 | hack/ci-deploy-operator.sh | Installs CRDs and deploys operator via Helm |
| 8 | Deploy Keystone CR | Applies matrix.config-dir/00-keystone-cr.yaml and waits for matrix.cr-name Ready |
| 9 | hack/ci-run-tempest.sh | Runs Tempest API tests with CONFIG_DIR=matrix.config-dir, SERVICE_K8S_NAME=matrix.service-k8s-name |
| 10 | Upload Tempest results | Uploads _output/tempest/ as tempest-<release>-results artifact (14-day retention) |
| 11 | hack/ci-dump-diagnostics.sh (always) | Dumps diagnostic info with OPERATOR=keystone |

Timeout: 45 minutes.

cleanup-e2e-tags

GH-310. Prunes the run-scoped GHCR tags pushed by build-e2e-images (e2e-${run_id}-*) so they don't accumulate on the package page. Runs as a matrix over each E2E target package (keystone-operator, keystone, tempest) after every consumer that might still pull the images has finished. The always() && needs.build-e2e-images.result == 'success' condition means the cleanup runs on success, failure, cancelled, or skipped consumer outcomes — but only when build-e2e-images actually pushed something.

Dependencies: needs: [build-e2e-images, e2e-operator, e2e-chaos, tempest]
Permissions: contents: read, packages: write

The nightly cleanup-e2e-stale-tags job in cleanup-images.yaml is the safety net: if a workflow is cancelled before cleanup-e2e-tags fires, that job deletes any e2e-* tag older than one day across the same package set.

Timeout: 10 minutes.

build-and-push

Builds operator container images per platform on native runners and pushes each single-arch image by digest. Runs only on push events (main branch or v* tags) — skipped on pull requests. The multi-arch manifest list and final tags are assembled by the subsequent merge-operator-images job.

Dependencies: needs: [changes, e2e-operator]
Condition: if: github.event_name == 'push' && needs.e2e-operator.result == 'success'
Permissions: contents: read, packages: write

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | Prepare platform pair | Shell |
| 3 | docker/setup-buildx-action@v4 | Sets up Docker Buildx |
| 4 | docker/login-action@v4 | Authenticates to GHCR (github.actor / GITHUB_TOKEN) |
| 5 | docker/metadata-action@v6 | Generates OCI labels (two-layer annotation pattern) |
| 6 | docker/build-push-action@v7 | Builds single-platform image; push-by-digest=true; digest exported as artifact |
| 7 | Export digest | Shell |
| 8 | Upload digest | actions/upload-artifact@v7 |

Matrix strategy:

yaml
strategy:
  fail-fast: false
  matrix:
    operator: ${{ fromJson(needs.changes.outputs.e2e-operators).operator }}
    platform: [linux/amd64, linux/arm64]
    include:
      - platform: linux/amd64
        runner: ubuntu-latest
      - platform: linux/arm64
        runner: ubuntu-24.04-arm

Build context is the repository root (required by go.work), with the Dockerfile at operators/<operator>/Dockerfile. GitHub Actions cache (type=gha) is scoped per platform (<operator>-operator-linux-amd64 / <operator>-operator-linux-arm64).
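The per-platform cache scope names follow mechanically from the matrix values. A minimal sketch of that derivation, assuming the "Prepare platform pair" step does nothing more than replace the slash in the platform string (the function names here are hypothetical):

```bash
set -eu

# linux/arm64 -> linux-arm64
platform_pair() {
  printf '%s\n' "$1" | tr '/' '-'
}

# <operator>-operator-<platform-pair>, the scope format quoted above.
cache_scope() {
  printf '%s-operator-%s\n' "$1" "$(platform_pair "$2")"
}

for platform in linux/amd64 linux/arm64; do
  cache_scope keystone "$platform"
done
```

Keeping the scope distinct per platform means the amd64 and arm64 builds never overwrite each other's cached layers.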

merge-operator-images

Downloads per-platform digests from build-and-push, assembles the multi-arch manifest list, and pushes it with the final tags.

Dependencies: needs: [changes, build-and-push]
Condition: if: github.event_name == 'push' && needs.build-and-push.result == 'success'
Permissions: contents: read, packages: write

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | docker/setup-buildx-action@v4 + docker/login-action@v4 | Authenticates to GHCR |
| 3 | docker/metadata-action@v6 | Generates final image tags |
| 4 | Download digests | actions/download-artifact@v4 |
| 5 | Create and push manifest list | Shell |

Matrix strategy: Same operator dimension as build-and-push (via fromJson(needs.changes.outputs.e2e-operators)).

Image tagging strategy:

| Trigger | Tags Applied |
| --- | --- |
| Push to main | sha-<full-sha>, latest |
| Push v* tag (from main) | sha-<full-sha>, latest, <version> (e.g. 0.1.0, v prefix stripped) |
| Push v* tag (from non-main) | sha-<full-sha>, <version> (no latest; restricted to default branch) |

Images are published at ghcr.io/c5c3/<operator>-operator:<tag>.
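The tagging table can be restated as a small shell function. This is only a sketch of the rules as documented: the workflow itself generates tags with docker/metadata-action, whose configuration is not shown here, and both the function name image_tags and its third argument are hypothetical.

```bash
set -eu

# Print the tags for one push, one per line.
#   $1  git ref (e.g. refs/heads/main or refs/tags/v0.1.0)
#   $2  full commit SHA
#   $3  "true" when the commit is on the default branch (hypothetical input)
image_tags() {
  echo "sha-$2"
  if [ "$3" = "true" ]; then
    echo "latest"
  fi
  case "$1" in
    refs/tags/v*) echo "${1#refs/tags/v}" ;;  # strip the ref path and the v prefix
  esac
}

image_tags refs/tags/v0.1.0 0123abc true
```

Every push gets the immutable sha-<full-sha> tag; latest and the version tag are layered on top only when their respective condition holds.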

helm-push

Packages and pushes operator Helm charts to the GHCR OCI registry. Runs only on push events — skipped on pull requests.

Dependencies: needs: [changes, e2e-operator]
Condition: if: github.event_name == 'push' && needs.e2e-operator.result == 'success'
Permissions: contents: read, packages: write

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | azure/setup-helm@v4 | Installs Helm CLI |
| 3 | Helm registry login | Authenticates to GHCR via helm registry login |
| 4 | Package and push | Packages chart and pushes to oci://ghcr.io/c5c3/charts/ |

Chart version derivation:

| Trigger | Version |
| --- | --- |
| Push to main | Default version from Chart.yaml |
| Push v* tag | SemVer derived from tag (v prefix stripped, e.g. v0.1.0 → 0.1.0) |

Matrix strategy:

yaml
strategy:
  matrix:
    operator: [keystone]

The make helm-package target packages operators/<operator>/helm/<operator>-operator/. When CHART_VERSION is set (for tag pushes), it overrides the version in Chart.yaml.
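A minimal sketch of the version derivation, assuming the job reads the standard GITHUB_REF variable; the function name chart_version_for_ref is hypothetical and the make command is printed rather than run:

```bash
set -eu

# Print the chart version implied by a git ref, or nothing for non-tag refs.
chart_version_for_ref() {
  case "$1" in
    refs/tags/v*) printf '%s\n' "${1#refs/tags/v}" ;;
  esac
}

ref="${GITHUB_REF:-refs/tags/v0.1.0}"   # fallback value is for local dry runs only
version="$(chart_version_for_ref "$ref")"

# Dry run: print the make invocation instead of executing it.
if [ -n "$version" ]; then
  echo "make helm-package OPERATOR=keystone CHART_VERSION=$version"
else
  echo "make helm-package OPERATOR=keystone"
fi
```

Leaving CHART_VERSION unset on main-branch pushes is what lets the default from Chart.yaml apply.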

github-release

Creates a GitHub Release with auto-generated release notes on v* tag pushes.

Dependencies: needs: [changes, merge-operator-images, helm-push]
Condition: if: startsWith(github.ref, 'refs/tags/v') && needs.merge-operator-images.result == 'success' && needs.helm-push.result == 'success'
Permissions: contents: write

| Step | Action | Details |
| --- | --- | --- |
| 1 | actions/checkout@v6 | Checks out the repository (SHA-pinned) |
| 2 | azure/setup-helm@v4 | Installs Helm CLI for chart packaging |
| 3 | Package Helm charts | Packages operator Helm charts with release version |
| 4 | softprops/action-gh-release@v2 | Creates release with generate_release_notes: true and attaches chart tarballs |

This job runs only after both merge-operator-images and helm-push complete successfully, ensuring the final multi-arch manifest list and charts are published before the release is created. Helm chart tarballs are attached as release assets for direct download. Timeout: 5 minutes.

Reusable CI Scripts

Repeated inline shell logic from E2E jobs is extracted into standalone scripts under hack/. Each script uses set -euo pipefail, includes an SPDX Apache-2.0 header, and passes shellcheck. All scripts are designed to work both in CI and locally against any kubeconfig.

hack/ci-dump-diagnostics.sh

Dumps diagnostic information after E2E failures. Shared across the e2e-infra, e2e-operator, e2e-chaos, e2e-prometheus, and tempest jobs.

| Environment Variable | Required | Default | Description |
| --- | --- | --- | --- |
| OPERATOR | No | (empty) | When set, emits operator-specific diagnostics (pod logs, CR status, job logs) |
| NAMESPACE | No | openstack | Kubernetes namespace for operator-specific queries |

Infrastructure diagnostics (always emitted): HelmReleases, pods, DaemonSets, events (last 50), and Flux logs across all namespaces.

Operator diagnostics (when OPERATOR is set): Operator pods and logs, job descriptions and logs in the target namespace, all pod logs (current and previous) in the namespace, operator CR status conditions, and ConfigMaps.
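The split between always-on and operator-gated output can be sketched as follows. This illustrates only the gating on an optional variable under set -u; the function name diagnostic_sections is hypothetical and no kubectl calls are made.

```bash
set -eu

# List which diagnostic groups would be emitted for the current environment.
diagnostic_sections() {
  local operator="${OPERATOR:-}"            # optional; empty means infra-only
  local namespace="${NAMESPACE:-openstack}" # documented default
  echo "infra"
  if [ -n "$operator" ]; then
    echo "operator:$operator namespace:$namespace"
  fi
}

diagnostic_sections
```

Reading OPERATOR through a default expansion is what keeps an unset variable from aborting the script under set -u.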

Usage:

bash
hack/ci-dump-diagnostics.sh                    # infra-only diagnostics
OPERATOR=keystone hack/ci-dump-diagnostics.sh   # + operator-specific diagnostics

hack/ci-build-service-image.sh

Builds an OpenStack service container image by resolving upstream source refs, cloning the project at the pinned ref, applying constraint overrides, and building the full image chain (python-base -> venv-builder -> service image).

| Environment Variable | Required | Default | Description |
| --- | --- | --- | --- |
| OPERATOR | Yes | - | OpenStack service name (e.g. keystone) |
| IMAGE_PREFIX | Yes | - | Container image prefix (e.g. ghcr.io/c5c3) |
| RELEASE | No | 2025.2 | Release directory name under releases/ |

The script reads releases/<RELEASE>/source-refs.yaml for the upstream Git ref and releases/<RELEASE>/extra-packages.yaml for additional pip/apt packages. The final image is tagged <IMAGE_PREFIX>/<OPERATOR>:<RELEASE>.
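How the required and optional variables combine into the final tag can be sketched like this. It is an illustration of the documented contract, not an excerpt from the script; the function name service_image_ref and the error messages are assumptions.

```bash
set -eu

# Compose <IMAGE_PREFIX>/<OPERATOR>:<RELEASE>, failing fast on missing input.
service_image_ref() {
  : "${OPERATOR:?OPERATOR is required (e.g. keystone)}"
  : "${IMAGE_PREFIX:?IMAGE_PREFIX is required (e.g. ghcr.io/c5c3)}"
  printf '%s/%s:%s\n' "$IMAGE_PREFIX" "$OPERATOR" "${RELEASE:-2025.2}"
}

# Run in a subshell so the example values do not leak into the caller.
(OPERATOR=keystone IMAGE_PREFIX=ghcr.io/c5c3 service_image_ref)
```

The `${VAR:?message}` expansion stops a non-interactive shell with the given message when the variable is unset or empty, which suits variables marked as required.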

Usage:

bash
OPERATOR=keystone IMAGE_PREFIX=ghcr.io/c5c3 hack/ci-build-service-image.sh

hack/ci-deploy-operator.sh

Deploys an operator into a kind cluster by installing CRDs, waiting for establishment, and deploying the operator via Helm with the specified container image.

| Environment Variable | Required | Default | Description |
| --- | --- | --- | --- |
| OPERATOR | Yes | - | Operator name (e.g. keystone) |
| IMAGE_REPO | Yes | - | Full image repository (e.g. ghcr.io/c5c3/keystone-operator) |
| IMAGE_TAG | No | dev | Image tag |

The script runs kubectl apply -f <chart>/crds/, waits for CRD establishment, then runs helm install with image.pullPolicy=Never (suitable for kind-loaded images).

Usage:

bash
OPERATOR=keystone IMAGE_REPO=ghcr.io/c5c3/keystone-operator hack/ci-deploy-operator.sh

hack/ci-build-tempest-image.sh

Builds the Tempest test container image by resolving Tempest and plugin version refs from the release config, then running docker build with the pinned versions.

| Environment Variable | Required | Default | Description |
| --- | --- | --- | --- |
| RELEASE | No | 2025.2 | Release directory name under releases/ |
| TEMPEST_IMAGE | No | c5c3/tempest:local | Target image name:tag |

The script reads releases/<RELEASE>/test-refs.yaml to resolve tempest and keystone-tempest-plugin versions, then builds images/tempest/Dockerfile with the appropriate build args and upper-constraints build context.

Usage:

bash
hack/ci-build-tempest-image.sh
RELEASE=2025.2 TEMPEST_IMAGE=c5c3/tempest:local hack/ci-build-tempest-image.sh

hack/ci-run-tempest.sh

CI-specific Tempest execution wrapper that handles port-forwarding, config generation, and Docker-based test execution. This is the CI counterpart to hack/run-tempest.sh (which handles local execution including image building).

| Environment Variable | Required | Default | Description |
| --- | --- | --- | --- |
| SERVICE | No | keystone | Service under test |
| CONFIG_DIR | No | tests/tempest/<SERVICE> | Directory containing tempest.conf template and include/exclude lists |
| NAMESPACE | No | openstack | Kubernetes namespace |
| ADMIN_SECRET | No | keystone-admin | Secret name holding admin password |
| OUTPUT_DIR | No | _output/tempest | Test output directory |
| TEMPEST_IMAGE | No | c5c3/tempest:local | Tempest container image |
| SERVICE_K8S_NAME | No | <SERVICE>-tempest-api | K8s Service name for port-forwarding (allows override for release-specific CR names, e.g. keystone-tempest-2026-1-api) |

The script:

  1. Extracts the admin password from the Kubernetes secret
  2. Sets up kubectl port-forward to the service and waits for readiness
  3. Generates tempest.conf from the template, substituting endpoint and credentials
  4. Runs Tempest in a Docker container with --network host and host-alias DNS entries
  5. Converts subunit output to JUnit XML and checks for failures
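The readiness wait in step 2 is a bounded retry loop at heart. A minimal sketch of such a helper follows; the name wait_until is hypothetical, and the port and URL in the commented usage are assumptions rather than values taken from the script.

```bash
set -eu

# Run a probe command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the probe succeeds, 1 if it never does.
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Intended use against a live cluster (port 5000 and the /v3 path are assumed):
#   kubectl -n openstack port-forward svc/keystone-tempest-api 5000:5000 &
#   wait_until 30 1 curl -sf http://localhost:5000/v3
if wait_until 3 0 true; then
  echo "ready"
fi
```

Bounding the number of attempts lets the job fail with a clear error instead of hanging until the 45-minute job timeout.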

Usage:

bash
hack/ci-run-tempest.sh
SERVICE=keystone OUTPUT_DIR=_output/tempest hack/ci-run-tempest.sh

Composite Action: setup-test-deps

.github/actions/setup-test-deps/action.yaml

A composite GitHub Action that encapsulates the shared cache + make install-test-deps step used by every job that needs the pinned chainsaw/flux/kind/kubectl binaries. Extracted so the cache key, restore-keys:, and PATH wiring live in one place: setup-e2e-infra (cluster-bound jobs) and the lightweight chainsaw-lint job both consume this and inherit any future tweaks (key bump, additional pinned tool) for free.

| Step | Description |
| --- | --- |
| 1 | Restores $HOME/.local/bin from cache, keyed on the hash of hack/install-test-deps.sh (auto-invalidates when any pinned tool version changes) |
| 2 | Runs make install-test-deps (no-op on cache hit thanks to the script's skip-if-correct-version logic) and appends ~/.local/bin to GITHUB_PATH |

The action takes no inputs.
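Keying the cache on the installer script's content hash is what makes a version bump invalidate it automatically. The sketch below demonstrates that property with a throwaway file standing in for hack/install-test-deps.sh. The key prefix and the pinned-version line are made up for illustration, and sha256sum is assumed to be available, as it is on Linux runners.

```bash
set -eu

# Derive a cache key from the content hash of the installer script.
# The "test-deps-" prefix is illustrative, not the action's real key.
cache_key() {
  local digest
  digest="$(sha256sum "$1" | cut -d ' ' -f 1)"
  printf 'test-deps-%s\n' "$digest"
}

# Demonstrate invalidation with a stand-in for the real script.
tmp="$(mktemp)"
echo 'CHAINSAW_VERSION=1.0.0' > "$tmp"   # made-up pin
before="$(cache_key "$tmp")"
echo 'CHAINSAW_VERSION=1.0.1' > "$tmp"   # bump the made-up pin
after="$(cache_key "$tmp")"
rm -f "$tmp"

if [ "$before" != "$after" ]; then
  echo "key changed: cache would be rebuilt"
fi
```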

Composite Action: setup-e2e-infra

.github/actions/setup-e2e-infra/action.yaml

A composite GitHub Action that encapsulates the shared Flux CLI + test dependencies + infrastructure deployment sequence used by e2e-infra, e2e-operator, and tempest jobs. This replaces three duplicated step sequences with a single uses: reference.

Prerequisite: A kind cluster must already exist (the action sets SKIP_KIND_CREATE=true internally).

| Step | Description |
| --- | --- |
| 1 | Installs Flux CLI via fluxcd/flux2/action@v2.8.3 (SHA-pinned) |
| 2 | Delegates to the setup-test-deps composite action (cache restore + make install-test-deps + PATH wiring) |
| 3 | Runs make deploy-infra with SKIP_KIND_CREATE=true |

Usage in a workflow job:

yaml
- name: Setup E2E infrastructure
  uses: ./.github/actions/setup-e2e-infra

The action takes no inputs. All configuration is handled by existing Makefile targets and environment variables.

Composite Action: load-e2e-images

.github/actions/load-e2e-images/action.yaml

A composite GitHub Action that pulls pre-built E2E images from GHCR (under the run-scoped tag pushed by build-e2e-images) and re-tags them to their canonical local references so downstream kind load docker-image calls work unchanged. Shared between e2e-operator, e2e-chaos, and tempest jobs.

| Step | Description |
| --- | --- |
| 1 | docker/login-action@v4 authenticates to GHCR using the workflow's GITHUB_TOKEN |
| 2 | For each input ref, docker pull <repo>:e2e-${run_id}-<orig_tag> then docker tag to the canonical local ref |

| Input | Default | Description |
| --- | --- | --- |
| run-id | ${{ github.run_id }} | Run ID used as the tag prefix (e2e-<run-id>-) |
| images | (required) | Multiline list of canonical local refs (e.g. ghcr.io/c5c3/keystone:2025.2); blank/comment lines are ignored |
| registry | ghcr.io | Registry to authenticate against |
| username | ${{ github.actor }} | Login user |
| password | ${{ github.token }} | Login token |

Usage in a workflow job:

yaml
- name: Load E2E images
  uses: ./.github/actions/load-e2e-images
  with:
    images: |
      ${{ env.IMAGE_PREFIX }}/keystone-operator:dev
      ${{ env.IMAGE_PREFIX }}/keystone:2025.2

GH-310 replaced the previous actions/download-artifact + zstd | docker load sequence: the 355 MB single-blob artifact intermittently timed out at the five-minute window (actions/download-artifact has no built-in retry on a stalled download). Layer-level pull retries plus the GHCR CDN dramatically reduce the failure rate.
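The mapping from canonical refs to run-scoped refs is simple string manipulation. The sketch below prints the docker commands the action would need for a given images list. It is not the action's actual script: the function names are hypothetical, comment lines are assumed to start with '#', and every ref is assumed to carry an explicit tag after its last colon.

```bash
set -eu

# <repo>:<tag> -> <repo>:e2e-<run-id>-<tag>
# Assumes the ref carries an explicit tag after the last ':'.
run_scoped_ref() {
  local run_id="$1" ref="$2"
  printf '%s:e2e-%s-%s\n' "${ref%:*}" "$run_id" "${ref##*:}"
}

# Print the docker commands for a multiline list of canonical refs,
# skipping blank lines and lines that start with '#'.
plan_image_loads() {
  local run_id="$1" ref scoped
  printf '%s\n' "$2" | while IFS= read -r ref; do
    # Trim surrounding whitespace left over from YAML block indentation.
    ref="$(printf '%s' "$ref" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')"
    case "$ref" in
      '' | '#'*) continue ;;
    esac
    scoped="$(run_scoped_ref "$run_id" "$ref")"
    echo "docker pull $scoped"
    echo "docker tag $scoped $ref"
  done
}

plan_image_loads 12345 '
# operator image
ghcr.io/c5c3/keystone-operator:dev

ghcr.io/c5c3/keystone:2025.2
'
```

Re-tagging to the canonical ref is what lets the later kind load docker-image step stay unaware of the run-scoped naming.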

How the Pieces Fit Together

The E2E jobs follow a common pattern with shared components:

1. Checkout + Go setup + kind cluster creation     (workflow steps)
2. Pull pre-built images from GHCR                  (load-e2e-images composite action)
3. Load images into kind                            (workflow steps)
4. Deploy infrastructure                            (setup-e2e-infra composite action)
5. Deploy operator                                  (hack/ci-deploy-operator.sh)
6. Run tests                                        (chainsaw / hack/ci-run-tempest.sh)
7. Dump diagnostics                                 (hack/ci-dump-diagnostics.sh)
8. Upload artifacts                                 (workflow steps)

Image building is centralised in build-e2e-images, which runs once before the E2E jobs and pushes every image to GHCR under a run-scoped tag. The e2e-infra job uses steps 1, 4, 6-8 (no operator or service images needed). The e2e-operator, e2e-chaos, and tempest jobs use all steps, pulling their required images from GHCR via load-e2e-images. The e2e-chaos job uses a chaos-specific Chainsaw config (tests/e2e-chaos/chainsaw-config.yaml) and test directory (tests/e2e-chaos/). The tempest job additionally deploys a Keystone CR before running hack/ci-run-tempest.sh instead of Chainsaw. The cleanup-e2e-tags job prunes the run-scoped tags after every consumer finishes.

Go Setup Convention

All Go-based jobs use actions/setup-go@v6 with:

yaml
go-version-file: go.work

This reads the Go version from go.work (currently Go 1.25.0) rather than hardcoding a go-version value. The repository root contains go.work (not go.mod) because the project uses a Go Workspace with multiple modules (internal/common, operators/keystone, operators/c5c3). Module dependency caching is enabled by default in actions/setup-go@v6.

Concurrency

The workflow uses a concurrency group scoped per-branch per-workflow:

yaml
concurrency:
  group: ${{ github.ref }}-${{ github.workflow }}
  cancel-in-progress: ${{ github.event_name == 'pull_request' }}

For pull requests, pushing new commits cancels any in-progress CI run for that same PR branch, preventing wasted CI resources on outdated code. For pushes to main, in-progress runs are not cancelled, ensuring every merge commit is fully validated. Different branches do not cancel each other's runs.

Action Pinning

All GitHub Actions are referenced by full SHA hash with a trailing version comment:

yaml
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6

This prevents supply chain attacks via mutable tag retargeting and provides audit traceability. The version comment preserves human readability.

SPDX Header

The file starts with the standard SPDX license header:

text
# SPDX-FileCopyrightText: Copyright 2026 SAP SE or an SAP affiliate company
#
# SPDX-License-Identifier: Apache-2.0
---

Codecov Configuration

.codecov.yml defines coverage status checks and component-level thresholds.

Status Checks

| Check | Target | Description |
| --- | --- | --- |
| Project | auto (threshold: 1%) | Overall coverage must not decrease by more than 1% |
| Patch | 90% | New/changed lines in a PR must meet 90% coverage |

fail_ci_if_error: false is set on each codecov/codecov-action step in the workflow (not in .codecov.yml, where it is not a valid key) because fork PRs do not have access to CODECOV_TOKEN. This prevents CI from failing due to upload issues on forks.

Flag Management

The flag_management section in .codecov.yml links CI-uploaded flags to coverage tracking rules. Flags follow the [unit|integration]-<target> naming convention, matching the CI matrix targets (common, keystone, c5c3). Each flag has carryforward: true, which ensures that when only a subset of flags is uploaded (e.g., only one operator changed), the missing flags carry forward their last-known coverage instead of reducing the total.

Defined flags:

| Flag | Paths | Source |
| --- | --- | --- |
| unit-common | internal/common/ | test job, common matrix leg |
| unit-keystone | operators/keystone/ | test job, keystone matrix leg |
| unit-c5c3 | operators/c5c3/ | test job, c5c3 matrix leg |
| integration-common | internal/common/ | test-integration job, common matrix leg |
| integration-keystone | operators/keystone/ | test-integration job, keystone matrix leg |
| integration-c5c3 | operators/c5c3/ | test-integration job, c5c3 matrix leg |

Component Thresholds

Each component is tracked independently on the Codecov dashboard:

| Component | Paths | Target | Rationale |
| --- | --- | --- | --- |
| common | internal/common/** | 80% | Shared library code underpinning all operators |
| controllers | operators/*/internal/controller/** | 70% | Controller reconciliation logic (envtest-dependent paths harder to cover) |
| webhooks | operators/*/api/** | 90% | Webhook validation/defaulting (incorrect admission logic causes silent data corruption) |

Makefile Targets

The CI workflow depends on several Makefile targets:

docker-build

Builds the operator Docker image from operators/<operator>/Dockerfile with the repository root as build context (required by go.work).

make docker-build OPERATOR=keystone [IMG=custom:tag]

The IMG variable controls the image tag, defaulting to ghcr.io/c5c3/<operator>-operator:latest. The OPERATOR variable is required.

helm-package

Packages the operator Helm chart from operators/<operator>/helm/<operator>-operator/.

make helm-package OPERATOR=keystone [CHART_VERSION=1.2.3]

When CHART_VERSION is set, it overrides the version in the chart's Chart.yaml. The packaged .tgz is output to the current directory. The OPERATOR variable is required.

test-common

Runs unit tests for internal/common only, producing a single coverage profile.

make test-common

Produces cover-unit-common.out. Used by the common matrix leg in the test CI job to deduplicate common coverage into a single upload.

test-operator

Runs unit tests for a single operator without internal/common.

make test-operator OPERATOR=keystone

Produces cover-unit-<operator>.out. Used by operator matrix legs in the test CI job. The OPERATOR variable is required.

test-integration

Runs envtest-based integration tests (tagged with //go:build integration) for operators. Requires setup-envtest to be installed.

make test-integration [OPERATOR=keystone]

Sets KUBEBUILDER_ASSETS via setup-envtest use <pinned-k8s-version> -p path, then runs go test -tags=integration for each operator module. Produces cover-integration-<operator>.out files. Without OPERATOR, runs for all operators in the OPERATORS list.

test-integration-common

Runs envtest-based integration tests for internal/common only.

make test-integration-common

Sets KUBEBUILDER_ASSETS via setup-envtest use <pinned-k8s-version> -p path, then runs go test -tags=integration ./internal/common/.... Produces cover-integration-common.out. Used by the common matrix leg in CI to meet the 80% codecov target for internal/common/.
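The coverage profile names produced by these targets line up with the Codecov flag names listed earlier. The sketch below spells out that pairing for each matrix leg. The pairing is inferred from the two naming conventions rather than copied from the workflow, and the function name coverage_names is hypothetical.

```bash
set -eu

# Print "<codecov-flag> <coverage-profile>" for one matrix leg.
#   $1  kind: unit or integration
#   $2  target: common, keystone, or c5c3
coverage_names() {
  case "$1" in
    unit | integration) ;;
    *) echo "unknown kind: $1" >&2; return 1 ;;
  esac
  printf '%s-%s cover-%s-%s.out\n' "$1" "$2" "$1" "$2"
}

for kind in unit integration; do
  for target in common keystone c5c3; do
    coverage_names "$kind" "$target"
  done
done
```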

Dependencies on Prior Features

The CI workflow depends on the following artifacts:

| Artifact | Used by | Purpose |
| --- | --- | --- |
| Makefile (lint target) | lint job | Iterates over OPERATORS variable to run golangci-lint per module |
| Makefile (test-common target) | test job (common leg) | Runs unit tests for internal/common with coverage profile |
| Makefile (test-operator target) | test job (operator legs) | Runs unit tests for a single operator with coverage profile |
| Makefile (test-integration target) | test-integration job (operator legs) | Runs envtest integration tests per operator with coverage profiles |
| Makefile (test-integration-common target) | test-integration job (common leg) | Runs envtest integration tests for internal/common with coverage profile |
| Makefile (docker-build target) | build-e2e-images, e2e-chaos, build-and-push jobs | Builds operator Docker images |
| Makefile (helm-package target) | helm-push job | Packages operator Helm charts |
| .golangci.yml | lint job | Provides linter configuration (enabled linters, exclusion rules, timeout) |
| go.work | All Go-based jobs | Provides the Go version for actions/setup-go@v6 |
| hack/*.sh | shellcheck job | Shell scripts validated by shellcheck |
| .codecov.yml | Codecov integration | Component-level coverage thresholds |
| hack/ci-dump-diagnostics.sh | e2e-infra, e2e-operator, e2e-chaos, tempest jobs | Shared diagnostic dump |
| hack/ci-build-service-image.sh | e2e-operator, e2e-chaos, tempest jobs | Builds OpenStack service images |
| hack/ci-deploy-operator.sh | e2e-operator, e2e-chaos, tempest jobs | Deploys operator via Helm |
| hack/ci-run-tempest.sh | tempest job | Runs Tempest API tests |
| .github/actions/setup-test-deps/ | chainsaw-lint job, setup-e2e-infra composite action | Composite action for testdeps cache + make install-test-deps |
| .github/actions/setup-e2e-infra/ | e2e-infra, e2e-operator, e2e-chaos, tempest jobs | Composite action for infra setup |
| .github/actions/load-e2e-images/ | e2e-operator, e2e-chaos, tempest jobs | Composite action that pulls run-scoped GHCR tags and re-tags them to canonical local refs (GH-310) |
| .github/actions/cleanup-ghcr-package/ | cleanup-e2e-tags job, cleanup-images.yaml | Wraps dataaxiom/ghcr-cleanup-action for delete-by-pattern and delete-by-exclusion modes |
| tests/e2e-chaos/chainsaw-config.yaml | e2e-chaos job | Chaos-specific Chainsaw configuration |