Skip to content

Cortex Scheduling

Repository: github.com/cobaltcore-dev/cortex

Cortex is a Kubernetes-native, modular scheduler for multi-domain resource placement. It implements the External Scheduler Delegation Pattern, where Nova delegates host ranking to Cortex after its own filtering phase, enabling placement decisions based on topology, thermal data, and custom constraints.

Supported Scheduling Domains

DomainDescriptionAPI Endpoint
NovaVM placement (KVM, Cloud Hypervisor & VMware)POST /scheduler/nova/external
CinderBlock storage volumesPOST /scheduler/cinder/external
ManilaShared filesystemsPOST /scheduler/manila/external
IronCore MachinesBare-metal machinesK8s CRD-based
Kubernetes PodsNative pod schedulingK8s CRD-based

Scheduler Delegation with Nova

text
┌───────────────────────────────────────────────────────────────────┐
│                        Nova Scheduler                             │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  1. FILTERING PHASE                                               │
│     ┌───────────────────────────────────────────────────────┐     │
│     │ Retrieve all compute hosts                            │     │
│     │ Filter: RAM, CPU, Disk, Traits, Availability Zones    │     │
│     │ Result: List of possible hosts                        │     │
│     └───────────────────────────────────────────────────────┘     │
│                               │                                   │
│                               ▼                                   │
│  2. WEIGHING PHASE                                                │
│     ┌───────────────────────────────────────────────────────┐     │
│     │ Ranking based on:                                     │     │
│     │ - Resource Utilization                                │     │
│     │ - Configured Weights                                  │     │
│     │ Result: Weighted host list                            │     │
│     └───────────────────────────────────────────────────────┘     │
│                               │                                   │
│                               ▼                                   │
│  3. CORTEX DELEGATION (CobaltCore Extension)                      │
│     ┌───────────────────────────────────────────────────────┐     │
│     │         ┌──────────────────────────┐                  │     │
│     │         │        CORTEX            │                  │     │
│     │         │                          │                  │     │
│     │         │  Knowledge Database      │                  │     │
│     │ Hosts ─▶│  - Topology              │─▶ Optimized      │     │
│     │ Weights │  - Thermal Data          │   Host List      │     │
│     │         │  - Network Proximity     │                  │     │
│     │         │  - Failure Domains       │                  │     │
│     │         │  - Custom Constraints    │                  │     │
│     │         └──────────────────────────┘                  │     │
│     └───────────────────────────────────────────────────────┘     │
│                               │                                   │
│                               ▼                                   │
│  4. SCHEDULING PHASE                                              │
│     ┌───────────────────────────────────────────────────────┐     │
│     │ Place VM on highest ranked host                       │     │
│     │ On failure: Try next host                             │     │
│     └───────────────────────────────────────────────────────┘     │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘

For the Hypervisor CRD that represents compute hosts, see CRDs. For VM placement in the context of eviction and migration, see Hypervisor Lifecycle.

Cortex Architecture Overview

text
┌─────────────────────────────────────────────────────────────────────────┐
│                       CORTEX (Control Plane Cluster)                    │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                     DATA ARCHITECTURE (3-Tier)                    │  │
│  │                                                                   │  │
│  │  DATASOURCES           KNOWLEDGES           KPIs                  │  │
│  │  (Raw Data Sync)       (Feature Extract)    (Prometheus)          │  │
│  │                                                                   │  │
│  │  ┌────────────┐       ┌────────────┐       ┌────────────┐         │  │
│  │  │ OpenStack  │──────▶│ Extracted  │──────▶│ Prometheus │         │  │
│  │  │ - Nova     │       │ Features   │       │ Metrics    │         │  │
│  │  │ - Placement│       │            │       │            │         │  │
│  │  │ - Cinder   │       │ Stored in  │       │ Dynamically│         │  │
│  │  │ - Manila   │       │ CRD Status │       │ generated  │         │  │
│  │  │ - Identity │       └────────────┘       └────────────┘         │  │
│  │  ├────────────┤                                                   │  │
│  │  │ Prometheus │       ┌────────────┐                              │  │
│  │  │ (Metrics)  │──────▶│ PostgreSQL │ ◀── Persistent Storage       │  │
│  │  └────────────┘       └────────────┘                              │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    ▼                                    │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                        PIPELINE ENGINE                            │  │
│  │                                                                   │  │
│  │  ┌─────────────────────────┐  ┌─────────────────────────┐         │  │
│  │  │ FILTER-WEIGHER PIPELINE │  │ DETECTOR PIPELINE       │         │  │
│  │  │ (Initial Placement)     │  │ (Descheduling)          │         │  │
│  │  │                         │  │                         │         │  │
│  │  │ 1. Filters → Remove     │  │ 1. Detects problem      │         │  │
│  │  │ 2. Weighers → Score     │  │ 2. Creates Descheduling │         │  │
│  │  │ 3. Activation → Combine │  │ 3. Triggers Migration   │         │  │
│  │  │ 4. → Decision CRD       │  │                         │         │  │
│  │  └─────────────────────────┘  └─────────────────────────┘         │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Cortex Custom Resource Definitions (CRDs)

Cortex defines 7 CRD types in API group cortex.c5c3.io/v1alpha1:

Data Collection

CRDDescription
DatasourceConfiguration of external data sources (OpenStack APIs, Prometheus)
KnowledgeExtracted features from Datasources (stored in CRD status)
KPIPrometheus metrics generated from Knowledge data

Scheduling

CRDDescription
PipelineDefinition of Filter-Weigher or Detector pipelines
DecisionScheduling decision with history and explanation
DeschedulingMigration instruction for problematic VMs
ReservationCapacity reservation for future workloads

For the core CobaltCore CRDs (Hypervisor, Eviction, Migration, etc.), see CRDs.

Example: Pipeline CRD

yaml
apiVersion: cortex.c5c3.io/v1alpha1
kind: Pipeline
metadata:
  name: nova-kvm-pipeline
spec:
  type: filter-weigher
  schedulingDomain: nova
  createDecisions: true
  filters:
    - name: filter_has_enough_capacity
      params: {}
    - name: filter_correct_az
      params: {}
    - name: filter_instance_group_affinity
      params: {}
  weighers:
    - name: weigher_general_purpose_balancing
      multiplier: 1.0
    - name: weigher_avoid_long_term_contended_hosts
      multiplier: 1.5
status:
  conditions:
    - type: Ready
      status: "True"
    - type: AllStepsReady
      status: "True"

Example: Datasource CRD (OpenStack)

yaml
apiVersion: cortex.c5c3.io/v1alpha1
kind: Datasource
metadata:
  name: nova-hypervisors
spec:
  type: openstack
  schedulingDomain: nova
  openstack:
    type: nova
    syncInterval: "60s"
status:
  lastSynced: "2024-01-15T10:30:00Z"
  conditions:
    - type: Ready
      status: "True"

Available Filter Plugins (Nova)

PluginDescription
filter_allowed_projectsEnforces project quotas
filter_capabilitiesMatches image properties
filter_correct_azZone/Aggregate constraints
filter_external_customerExternal customer restrictions
filter_has_acceleratorsGPU/Accelerator requirements
filter_has_enough_capacityResource availability
filter_has_requested_traitsPlacement traits
filter_host_instructionsForce/Ignore host lists
filter_instance_group_affinityVM affinity rules

Available Weigher Plugins (VMware)

PluginDescription
vmware_hana_binpackingTight packing for SAP HANA
vmware_general_purpose_balancingLoad balancing
vmware_avoid_long_term_contended_hostsAvoid chronic contention
vmware_avoid_short_term_contended_hostsAvoid temporary contention
vmware_anti_affinity_noisy_projectsNoisy neighbor isolation

CRD Relationships

text
┌───────────────────────────────────────────────────────────────────────┐
│                        Cortex CRD Relationships                       │
│                                                                       │
│   Datasource ────▶ Knowledge ────▶ KPI ────▶ Prometheus Metrics       │
│       │               │                                               │
│       │               │                                               │
│       ▼               ▼                                               │
│   PostgreSQL     CRD Status                                           │
│   (Raw Data)     (Features)                                           │
│                                                                       │
│                                                                       │
│   Pipeline ─────────────────────────▶ Decision                        │
│       │                                   │                           │
│       │ (type: filter-weigher)            ├── orderedHosts            │
│       │                                   ├── stepResults             │
│       │                                   ├── history                 │
│       │                                   └── explanation             │
│       │                                                               │
│       │ (type: detector)                                              │
│       │                                                               │
│       └─────────────────────────────▶ Descheduling                    │
│                                           │                           │
│                                           ├── prevHost                │
│                                           ├── newHost                 │
│                                           └── reason                  │
│                                                                       │
│   Reservation                                                         │
│       │                                                               │
│       ├── requests (memory, cpu)                                      │
│       ├── host (reserved)                                             │
│       └── phase (active/failed)                                       │
│                                                                       │
└───────────────────────────────────────────────────────────────────────┘