Current behavior:
When Nova places a server in ERROR before kubelet registers, the corresponding Machine object may keep an empty .status:
k get machine -n kube-system 2xh100nvl-6978bf6c49-ccdtj -oyaml | yq .status
{}
Consumer components (e.g. kkp-api/dashboard) only see a Machine that is still provisioning, and the actual provider fault cannot surface unless the user has access to the underlying OpenStack project (which is not always the case):
os server show f472b2c8-fd17-4759-afaf-ef732776d9cc -c status -c fault
+--------+-------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+--------+-------------------------------------------------------------------------------------------------------------------------------+
| fault | {'code': 500, 'created': '2026-06-04T21:29:31Z', 'message': 'No valid host was found. There are not enough hosts available.'} |
| status | ERROR |
+--------+-------------------------------------------------------------------------------------------------------------------------------+
(In this case, the error is caused by a lack of hosts that can schedule this VM flavour, i.e. one with 2xh100 GPUs.)
Expected behavior:
If the OpenStack server enters the ERROR state, the machine-controller should extract the instance's fault information and persist it to Machine.status.errorReason and Machine.status.errorMessage, even when no Kubernetes Node exists yet. An error reason already exists for this kind of failure, for instance quota exhaustion:
machine-controller/sdk/apis/machines/v1alpha1/types.go, lines 215 to 217 in 3851738:

    // This generally refers to exceeding one's quota in a cloud provider,
    // or running out of physical machines in an on-premise environment.
    InsufficientResourcesMachineError MachineStatusError = "InsufficientResources"
The Dashboard can then surface provider-side provisioning failures such as "No valid host was found. There are not enough hosts available." instead of showing an endless provisioning loop.