Hypervisor Lifecycle
Complete State Diagram
The following diagram shows all states of a hypervisor node and the transitions between them. Each transition is triggered by a specific trigger and processed by a specific controller. For the agent and controller architecture, see Component Interaction.
┌─────────┐
│ Initial │
└────┬────┘
│ Trigger: New K8s Node detected
│ Controller: Hypervisor Controller
▼
┌─────────────────────┐
┌───▶│ Onboarding │───────┐
│ └──────────┬──────────┘ │
│ │ │ Error during configuration
│ │ Trigger: │ Controller: Onboarding Controller
│ │ Configuration │
│ │ completed ▼
│ │ ┌──────────┐
│ │ │ Aborted │
│ ▼ └──────────┘
│ ┌───────────┐ ▲
│ │ Testing │───────────┘
│ └─────┬─────┘ Error during validation
│ │ Controller: Onboarding Controller
│ │
│ │ Trigger: Tests passed
│ │ (or skipTests: true)
│ ▼
│ ┌───────────┐
│ │ Ready │
│ └─────┬─────┘
│ │ Trigger: Nova Scheduling enabled
│ │ Controller: Hypervisor Controller
│ ▼
Re-Enable │ ┌─────────────────────┐
after maint. ───┘ │ Active │◀────────────────────────┐
└──┬──────────┬───────┘ │
│ │ │
Trigger: Admin/ │ │ Trigger: Node-Failure/ │
Gardener Rolling │ │ Admin/Gardener Rolling Update │
Update │ │ Controller: Eviction Controller │
Controller: │ │ │
Hypervisor Ctrl │ ▼ │
│ ┌─────────────┐ │
│ │ Evicting │ │
│ └──────┬──────┘ │
│ │ │
│ ├── Trigger: Eviction Complete │
│ │ (planned maintenance) │
│ │ Controller: Eviction Ctrl ────┘
│ │
▼ ▼ Trigger: Eviction Complete
┌─────────────┐ (Decommission)
│ Maintenance │ Controller: Eviction Controller
└──────┬──────┘ │
│ ▼
│ ┌──────────────────┐
│ │ Decommissioned │
│ └──────────────────┘
│
│ Trigger: Maintenance completed,
│ Re-Enable
│ Controller: Hypervisor Controller
└──────────▶ (back to Active)State Overview:
| State | Description | Responsible Controller |
|---|---|---|
| Initial | Newly discovered K8s Node, Hypervisor CRD is created | Hypervisor Controller |
| Onboarding | Configuration and integration into OpenStack | Onboarding Controller |
| Testing | Validation checks running | Onboarding Controller |
| Aborted | Onboarding/Testing failed | Onboarding Controller |
| Ready | Node is ready, waiting for scheduling approval | Hypervisor Controller |
| Active | Node actively hosts VMs | Hypervisor Controller |
| Maintenance | Planned maintenance, VMs evacuated | Hypervisor Controller |
| Evicting | VM evacuation in progress | Eviction Controller |
| Decommissioned | Node decommissioned | Hypervisor Controller |
Onboarding Process
┌─────────┐ ┌───────────┐ ┌─────────┐ ┌───────┐ ┌───────┐
│ Initial │────▶│ Onboarding│────▶│ Testing │────▶│ Ready │────▶│ Active│
└─────────┘ └───────────┘ └─────────┘ └───────┘ └───────┘
│ │
│ │
▼ ▼
┌─────────┐ ┌─────────┐
│ Aborted │ │ Aborted │
└─────────┘ └─────────┘Phase "Initial":
The Hypervisor Controller watches the Kubernetes Node API in the Hypervisor Cluster. When a new node joins the cluster, the controller automatically creates a Hypervisor CRD (see CRDs) with the initial state. The CRD contains the node reference and the desired configuration.
Phase "Onboarding":
The Onboarding Controller takes over configuration of the new node:
- Nova Host Aggregate: Node is assigned to the configured Host Aggregate in Nova
- Traits Sync: OpenStack Traits (CPU capabilities, hardware features) are synchronized from the Hypervisor CRD into Nova
- OpenStack API Registration: Node is registered as Compute Host in Nova, Service-ID and Hypervisor-ID are stored in the CRD
Phase "Testing":
After onboarding, automatic validation checks are performed:
- LibVirt Connectivity: Check if the LibVirt daemon is reachable via TCP (Connection URI
qemu+tcp://<host>:16509/systemorch+tcp://<host>:16509/systemdepending on the backend), if VMs can be started, and if the LibVirt version meets the expected minimum version. The result is documented in the Hypervisor CRD status (libVirtVersion). - OVS Bridges: Verification that br-int, br-ex (and optionally br-provider) are correctly configured
- Ceph RBD Access: Test if the node can access the Ceph RBD pools (block storage for VM disks)
Note: With
skipTests: truein the Hypervisor CRD Spec, the Testing phase can be skipped. This is useful for development environments, not recommended for production.
Phase "Ready -> Active":
Once all tests pass, the node is marked as Ready. The Hypervisor Controller enables the node for Nova scheduling --- from this point, VMs can be placed on the node and the state changes to Active.
Error Handling:
The Aborted state is set when:
- OpenStack API registration fails (e.g., Keystone authentication fails)
- Host Aggregate assignment is not possible (e.g., aggregate does not exist)
- One or more validation checks fail in the Testing phase
With Aborted, the cause must be fixed and the onboarding process manually restarted by resetting the Hypervisor CRD status.
Maintenance Mode
Maintenance mode enables planned maintenance work on a hypervisor node without permanently losing VMs.
Triggers:
| Trigger | Initiator | Maintenance Mode |
|---|---|---|
| Manual (Admin) | Admin sets spec.maintenance: "manual" in the Hypervisor CRD | manual |
| Automatic (Gardener) | Gardener Rolling Update requires node restart | auto |
| HA Event | HA Agent detects hardware problem (see High Availability) | ha |
| Node Termination | Kubernetes Node is terminated | termination |
Flow:
┌──────────┐ ┌───────────────┐ ┌──────────────┐ ┌─────────────┐ ┌──────────┐
│ Active │───▶│ Scheduling │───▶│ VM │───▶│ Maintenance │───▶│ Active │
│ │ │ Stop │ │ Evacuation │ │ (Maint.) │ │ │
└──────────┘ └───────────────┘ └──────────────┘ └─────────────┘ └──────────┘- Scheduling Stop: Hypervisor is disabled in Nova, no new VMs are placed
- VM Evacuation: All VMs are migrated via live migration to other hypervisors (identical to eviction process)
- Maintenance: Node is free for maintenance work (OS update, firmware update, hardware replacement)
- Re-Enable: After maintenance is complete, the hypervisor is re-activated in Nova and returns to
Activestate
Difference to Eviction: Maintenance mode is planned and controlled. Return to active operation is expected. With eviction due to node failure, return is not guaranteed.
Eviction Process
┌────────────────────┐
│ Eviction Request │
│ (CRD created) │
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ Preflight Checks │
│ - OS Validation │
│ - Resource Check │
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ Hypervisor Disable │
│ (Scheduling Stop) │
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ VM Migration │◀──────┐
│ (Instance by │ │
│ Instance) │───────┘ (Loop until all migrated)
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ Eviction │
│ Complete │
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ Hypervisor Re- │ (optional, based on reason)
│ Enable │
└────────────────────┘Eviction Reasons:
| Reason | Initiator | Re-Enable after Eviction |
|---|---|---|
| Node Failure | HA Agent detects via LibVirt Events, automatically creates Eviction CRD (see High Availability) | No (node must be restored first) |
| Gardener Rolling Update | Gardener sets Node Drain, Operator creates Eviction CRD | Yes (after successful update) |
| Admin Request | Admin manually creates an Eviction CRD | Depends on reason |
| Decommissioning | Admin marks node for decommissioning | No |
Preflight Checks:
Before eviction begins, the Eviction Controller checks:
- Target Capacity: Enough free resources (RAM, vCPUs) on other hypervisors for all VMs
- Nova Scheduling Filter: Check if Nova Scheduler filters (Availability Zone, Host Aggregates, Traits) allow placement
- VM Compatibility: Check for VMs that cannot be migrated (e.g., PCI passthrough, local disks)
The eviction process is aborted if preflight checks fail. The Eviction CRD is updated with an appropriate condition (see CRDs).
VM Migration:
Migration occurs instance by instance via the Nova Live Migration API:
- Hypervisor Operator selects the next VM to migrate
- Nova Live Migration API is called (target host is determined by Nova Scheduler)
- Migration CRD is created and monitored (see CRDs)
- After successful migration: next VM, until all are migrated
Timeout Handling:
If a single VM migration does not complete within the configured timeout:
- Migration is marked as
stuck - Eviction Controller attempts to cancel the migration and restart it
- On repeated failure: VM is marked as non-migratable, eviction continues with remaining VMs
- The Eviction CRD documents all results via Conditions
Completion:
After eviction completes:
- All VMs have been successfully migrated (or marked as non-migratable)
- Hypervisor is disabled in Nova (
disabled) - For planned maintenance: Re-enable after maintenance completes
- For decommissioning: No re-enable, proceed to decommissioning process
Decommissioning
After complete eviction, a node can be permanently decommissioned:
- Remove Nova Host Aggregate: Node is removed from all Host Aggregates in Nova
- Clean up Traits: OpenStack Traits for this hypervisor are deleted
- Remove Nova Compute Service: Compute Service registration is deleted from Nova
- CRD Status: Hypervisor CRD is set to
Decommissioned
The node can subsequently be removed from the Kubernetes cluster. The Hypervisor CRD remains as documentation until manually deleted.