Validation

Configuration validation in CobaltCore operates across three layers, each catching different classes of errors at progressively later stages. Together, they provide defense-in-depth against misconfigurations. For the config generation pipeline that precedes validation, see Config Generation.

Validation Layers

text
┌─────────────────────────────────────────────────────────────────────────────┐
│                       VALIDATION LAYERS                                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                                                                       │  │
│  │  Layer 3: RUNTIME VALIDATION                                          │  │
│  │  oslo.config parsing, service startup checks, readiness probes        │  │
│  │                                                                       │  │
│  │  ┌───────────────────────────────────────────────────────────────┐    │  │
│  │  │                                                               │    │  │
│  │  │  Layer 2: OPERATOR RECONCILIATION                             │    │  │
│  │  │  Secret checks, connectivity, cross-resource, semantic rules  │    │  │
│  │  │                                                               │    │  │
│  │  │  ┌───────────────────────────────────────────────────────┐    │    │  │
│  │  │  │                                                       │    │    │  │
│  │  │  │  Layer 1: API SERVER                                  │    │    │  │
│  │  │  │  OpenAPI schema, types, required fields, enums        │    │    │  │
│  │  │  │                                                       │    │    │  │
│  │  │  └───────────────────────────────────────────────────────┘    │    │  │
│  │  │                                                               │    │  │
│  │  └───────────────────────────────────────────────────────────────┘    │  │
│  │                                                                       │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Layer 1: API Server

When a user or GitOps tool applies a Service CR, the Kubernetes API server validates it against the CRD's OpenAPI v3.0 schema before persisting it to etcd. This is the fastest feedback loop — invalid resources are rejected immediately.

What Layer 1 catches:

| Check | Example |
| --- | --- |
| Type validation | replicas: "three" rejected (expected integer) |
| Required fields | Missing spec.image.repository rejected |
| Enum constraints | cache.backend: redis rejected if not in allowed values |
| Format validation | spec.database.port: 99999 rejected (out of range) |
| Structural rules | Unknown fields rejected (with x-kubernetes-preserve-unknown-fields: false) |
| Policy override structure | policyOverrides validated as correct shape (rules as map, configMapRef as object reference) |
| Policy XValidation | CEL rule ensures at least one of rules or configMapRef is set when policyOverrides is present |
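
The constraints in the table map onto ordinary OpenAPI schema keywords in the CRD. The fragment below is an illustrative sketch, not the actual CobaltCore CRD; the allowed enum values for cache.backend are assumptions:

```yaml
# Illustrative CRD schema fragment (field names follow the examples above;
# enum values are assumptions, not the real allowed set).
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      required: ["image"]
      properties:
        replicas:
          type: integer            # rejects replicas: "three"
        image:
          type: object
          required: ["repository"] # rejects a missing spec.image.repository
          properties:
            repository:
              type: string
        database:
          type: object
          properties:
            port:
              type: integer
              minimum: 1
              maximum: 65535       # rejects port: 99999
        cache:
          type: object
          properties:
            backend:
              type: string
              enum: ["memcached", "valkey"]  # assumed allowed values
```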

Error reporting: Immediate rejection by kubectl apply or the API client:

text
$ kubectl apply -f nova.yaml
The Nova "nova" is invalid:
  spec.database.port: Invalid value: 99999: spec.database.port in body
  should be less than or equal to 65535

CEL validation rules (not yet implemented in C5C3). Kubernetes supports Common Expression Language (CEL) rules in CRDs (stable since 1.29) for cross-field validation. Examples that could be expressed:

  • If storage.backend is rbd, then storage.rbdPool must be set
  • If replicas.api is greater than 1, then cache.backend should be set (warn)
  • apiDatabase is required only for Nova (not other services)
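
The first rule above could be expressed roughly as the following x-kubernetes-validations entry. This is a sketch only (the rules are not yet implemented in C5C3), and the field paths are taken from the examples above:

```yaml
# Sketch of a CRD CEL validation rule (not yet implemented in C5C3).
x-kubernetes-validations:
  - rule: "self.storage.backend != 'rbd' || has(self.storage.rbdPool)"
    message: "storage.rbdPool must be set when storage.backend is rbd"
```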

Layer 2: Operator Reconciliation

After a CR passes API server validation, the Service Operator performs deeper semantic validation during reconciliation. These checks require runtime context that the API server schema cannot express.

What Layer 2 catches:

| Check | Description |
| --- | --- |
| Secret existence | Referenced K8s Secrets must exist and contain expected keys (see Secret Management) |
| Connectivity validation | Database host must be resolvable, RabbitMQ endpoint must be reachable |
| Cross-resource validation | Keystone must be Ready before Nova can configure auth |
| Dependency readiness | Infrastructure services (MariaDB, RabbitMQ, Valkey) must be operational |
| Semantic checks | Ceph pool name must match the configured Ceph cluster, OVN connection format must be valid |
| Policy ConfigMap existence | If policyOverrides.configMapRef is set, the referenced ConfigMap must exist and contain a policy.yaml key |
| Policy YAML validity | The policy.yaml content must be parseable YAML with a flat string → string mapping |
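
Once the policy.yaml document has been parsed (e.g., with a YAML library), the "flat string → string mapping" rule reduces to a simple shape check. The sketch below is illustrative, not the operator's actual implementation; the function name and return convention are assumptions:

```python
def validate_policy_mapping(data):
    """Check that parsed policy.yaml content is a flat str -> str mapping.

    Illustrative sketch of the Layer 2 'Policy YAML validity' rule; the
    operator's real implementation may differ. `data` is the parsed value
    of the ConfigMap's policy.yaml key. Returns a list of human-readable
    problems; an empty list means the mapping is valid.
    """
    if not isinstance(data, dict):
        return [f"top-level value must be a mapping, got {type(data).__name__}"]
    problems = []
    for key, value in data.items():
        if not isinstance(key, str):
            problems.append(f"rule name {key!r} is not a string")
        if not isinstance(value, str):
            problems.append(f"value for {key!r} is not a string "
                            "(nested structures are rejected)")
    return problems


# A flat oslo.policy-style mapping passes...
assert validate_policy_mapping(
    {"os_compute_api:servers:create": "rule:admin_or_owner"}) == []
# ...while nested values or non-mapping documents are rejected.
assert validate_policy_mapping({"rule": {"nested": "no"}})
assert validate_policy_mapping(["not", "a", "mapping"])
```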

Status conditions: Operators report validation results as conditions on the Service CR:

| Condition | Status | Reason | Description |
| --- | --- | --- | --- |
| DatabaseReady | True/False | SecretFound / SecretMissing | Database credentials available and connection verified |
| MessagingReady | True/False | Connected / ConnectionFailed | RabbitMQ endpoint reachable |
| KeystoneAuthReady | True/False | CredentialValid / CredentialExpired | Application credential valid |
| CephConnected | True/False | PoolAccessible / PoolNotFound | Ceph RBD pool accessible |
| OVNConnected | True/False | Connected / Unreachable | OVN NB/SB database reachable |
| ConfigReady | True/False | Rendered / DependencyNotMet / PolicyConfigMapMissing | Configuration successfully rendered (includes policy.yaml if policyOverrides set) |
| Ready | True/False | AllChecksPass / ConfigError | Overall service readiness |

Error reporting: Conditions are visible via kubectl describe and can be monitored by Prometheus:

text
$ kubectl describe nova nova -n openstack
Status:
  Conditions:
    Type:    DatabaseReady
    Status:  False
    Reason:  SecretMissing
    Message: Secret "nova-db-credentials" not found in namespace "openstack"

    Type:    ConfigReady
    Status:  False
    Reason:  DependencyNotMet
    Message: Cannot render config: database credentials unavailable

The operator re-checks on every reconciliation cycle. Once the missing secret appears (e.g., ESO syncs from OpenBao), the condition transitions to True and config generation proceeds.

Layer 3: Runtime

After the ConfigMap is mounted and the pod starts, OpenStack's oslo.config library parses the INI file. This is the final validation layer — it catches issues that only manifest at service startup.

What Layer 3 catches:

| Check | Description |
| --- | --- |
| Unknown options | oslo.config warns about unrecognized config keys (logged, not fatal) |
| Deprecated options | oslo.config logs deprecation warnings for renamed/removed options |
| Connection failures | Database or messaging connection fails after config is parsed |
| Permission errors | Keystone auth fails due to expired/invalid credentials |
| Missing dependencies | Required Python modules not available for configured backend |
| Policy rule syntax | oslo.policy validates rule definitions at service startup (e.g., invalid check strings, unknown roles) |
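
For context, the input to this layer is a plain INI file. The fragment below is illustrative only (the hostname echoes the error example later in this section; the unknown key is invented to show the warning behavior):

```ini
# Illustrative nova.conf fragment as parsed by oslo.config at startup.
[DEFAULT]
debug = false
typo_option = true   ; unknown key: logged as a warning, not fatal

[database]
connection = mysql+pymysql://nova:password@maxscale.mariadb-system.svc/nova
```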

Error reporting: Pod logs and Kubernetes events:

text
$ kubectl logs nova-api-7f8b9c-x4k2j -n openstack
ERROR oslo_db.sqlalchemy.engines [-] Database connection failed:
  OperationalError: (pymysql.err.OperationalError)
  (2003, "Can't connect to MySQL server on 'maxscale.mariadb-system.svc'")

CrashLoopBackOff detection: If oslo.config parsing fails fatally (e.g., missing required option, invalid value type), the pod exits with a non-zero code. Kubernetes restarts it, and after repeated failures, the pod enters CrashLoopBackOff. This is visible via:

  • kubectl get pods — pod status shows CrashLoopBackOff
  • Readiness probe fails — Service endpoints are not updated
  • Operator conditions remain Ready: False

oslo-config-validator Integration (Design Concept)

OpenStack provides oslo-config-validator, a tool that validates a config file against the service's registered oslo.config options and their metadata (types, ranges, deprecated names). This can catch errors before the service attempts to start.

Concept: Init container approach

text
┌──────────────────────────────────────────────────────────────────────┐
│  Pod                                                                 │
│                                                                      │
│  ┌──────────────────────────────────┐                                │
│  │  Init Container: config-validator│                                │
│  │  Command: oslo-config-validator  │                                │
│  │    --config-file /etc/nova/      │                                │
│  │      nova.conf                   │                                │
│  │                                  │                                │
│  │  Exit 0 → proceed to main        │                                │
│  │  Exit 1 → pod fails, CRB         │                                │
│  └──────────────────┬───────────────┘                                │
│                     │                                                │
│                     ▼ (only if validator passes)                     │
│  ┌──────────────────────────────────┐                                │
│  │  Main Container: nova-api        │                                │
│  │  (starts normally)               │                                │
│  └──────────────────────────────────┘                                │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘
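
As a concept sketch, the init container from the diagram might look like this in the Deployment's pod template. The image name, volume name, and mount path are assumptions; the command mirrors the diagram above:

```yaml
# Concept sketch only: validate the rendered config before nova-api starts.
# Image name and mount paths are assumed, not the actual manifests.
initContainers:
  - name: config-validator
    image: nova-api:current   # must contain the oslo-config-validator tool
    command: ["oslo-config-validator", "--config-file", "/etc/nova/nova.conf"]
    volumeMounts:
      - name: nova-config
        mountPath: /etc/nova
```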

Trade-offs:

| Pro | Con |
| --- | --- |
| Catches invalid/deprecated options before service starts | Adds startup latency (oslo-config-validator must load all registered options) |
| Clear error messages in init container logs | Requires the validator tool in the container image |
| Prevents CrashLoopBackOff from config errors | Cannot validate connectivity (only syntax/schema) |

Note: oslo-config-validator integration is a design concept. The current implementation relies on Layers 1-3 described above.

Validation Flow Timeline

The full validation timeline from kubectl apply to a running, healthy service:

text
kubectl apply
       │
       ▼
┌─────────────┐
│ API Server  │──▶ Schema violation? ──▶ REJECTED (immediate)
│ (Layer 1)   │
└──────┬──────┘
       │ CR persisted to etcd
       ▼
┌─────────────┐
│ Operator    │──▶ Secret missing?    ──▶ Condition: DatabaseReady=False
│ Reconcile   │──▶ Dependency not met?──▶ Condition: ConfigReady=False
│ (Layer 2)   │──▶ All checks pass    ──▶ Render ConfigMap
└──────┬──────┘
       │ ConfigMap created, Deployment updated
       ▼
┌─────────────┐
│ Pod Start   │──▶ oslo.config error? ──▶ CrashLoopBackOff (pod logs)
│ (Layer 3)   │──▶ Connection fail?   ──▶ Readiness probe fails
│             │──▶ All OK             ──▶ Ready
└──────┬──────┘
       │
       ▼
   Service healthy
   Condition: Ready=True

Error Reporting Summary

| Layer | When | Feedback Mechanism | Latency |
| --- | --- | --- | --- |
| Layer 1: API Server | On kubectl apply | CLI error, API response | Immediate |
| Layer 2: Operator | During reconciliation | CR status conditions, events | Seconds |
| Layer 3: Runtime | On pod startup | Pod logs, CrashLoopBackOff, readiness probes | Seconds to minutes |