cluster-api-provider-openstack
Tracking issue for improving resource status in CAPO
This is a tracking issue for the CAPO effort to improve resource status.
High-level required changes with the new CAPI contract
Most of these changes will be required in the v1beta2 API contract (tentatively April 2025).
OpenStackCluster
The following changes are planned for the contract for the OpenStackCluster resource:
- Disambiguate the usage of the "ready" term by renaming fields used for the initial provisioning workflow:
  - Rename `status.ready` to `status.initialization.provisioned`.
- Remove `failureReason` and `failureMessage`.
Notes:
- OpenStackCluster's `status.initialization.provisioned` will surface into Cluster's `status.initialization.infrastructureProvisioned` field.
- OpenStackCluster's `status.initialization.provisioned` must signal the completion of the initial provisioning of the cluster infrastructure. The value of this field should never be updated after provisioning is completed, and Cluster API will ignore any changes to it.
- OpenStackCluster's `status.conditions[Ready]` will surface into Cluster's `status.conditions[InfrastructureReady]` condition.
- OpenStackCluster's `status.conditions[Ready]` must surface issues during the entire lifecycle of the OpenStackCluster (both during initial OpenStackCluster provisioning and after the initial provisioning is completed).
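To make the rename concrete, here is a minimal sketch of what the reworked status could look like in Go. This is an assumption for illustration, not CAPO's actual v1beta2 types; only the field paths quoted from the contract above are taken from the source.

```go
package v1beta2

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// OpenStackClusterStatus sketches the post-rename status. Struct and
// field names beyond the contract paths are illustrative assumptions.
type OpenStackClusterStatus struct {
	// Initialization groups fields used only for the initial
	// provisioning workflow (this replaces the old status.ready;
	// failureReason and failureMessage are removed entirely).
	// +optional
	Initialization InitializationStatus `json:"initialization,omitempty"`

	// Conditions holds metav1.Condition entries; the Ready condition is
	// the one surfaced into Cluster's status.conditions[InfrastructureReady].
	// +optional
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

// InitializationStatus carries the one-shot provisioning signal.
type InitializationStatus struct {
	// Provisioned signals completion of the initial provisioning of the
	// cluster infrastructure; it must never change back after it is set.
	// +optional
	Provisioned *bool `json:"provisioned,omitempty"`
}
```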
OpenStackMachine
The following changes are planned for the contract for the OpenStackMachine resource:
- Disambiguate the usage of the "ready" term by renaming fields used for the initial provisioning workflow:
  - Rename `status.ready` to `status.initialization.provisioned`.
- Remove `failureReason` and `failureMessage`.
Notes:
- OpenStackMachine's `status.initialization.provisioned` will surface into Machine's `status.initialization.infrastructureProvisioned` field.
- OpenStackMachine's `status.initialization.provisioned` must signal the completion of the initial provisioning of the machine infrastructure. The value of this field should never be updated after provisioning is completed, and Cluster API will ignore any changes to it.
- OpenStackMachine's `status.conditions[Ready]` will surface into Machine's `status.conditions[InfrastructureReady]` condition.
- OpenStackMachine's `status.conditions[Ready]` must surface issues during the entire lifecycle of the Machine (both during initial OpenStackMachine provisioning and after the initial provisioning is completed).
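For the consumer side, here is a hedged sketch of how the contract field could be read generically from any InfraMachine object, which is what Cluster API must do to surface it into Machine's `status.initialization.infrastructureProvisioned`. The helper name is hypothetical; only the nested field path comes from the contract above.

```go
package contract

import (
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// infraMachineProvisioned is a hypothetical helper that reads
// status.initialization.provisioned from an arbitrary InfraMachine.
func infraMachineProvisioned(infraMachine *unstructured.Unstructured) (provisioned bool, found bool, err error) {
	// NestedBool returns (value, found, error); found == false maps to
	// the not-yet-provisioned case.
	return unstructured.NestedBool(infraMachine.Object,
		"status", "initialization", "provisioned")
}
```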
Notes on Conditions
Some remarks about Kubernetes API conventions with regard to conditions:
- Polarity: Condition type names should make sense for humans; neither positive nor negative polarity can be recommended as a general rule.
- Use of the `Reason` field is required (currently in Cluster API a reason is added only when a condition is false).
- Controllers should apply their conditions to a resource the first time they visit the resource, even if the status is `Unknown` (currently Cluster API controllers add conditions at different stages of the reconcile loops). Please note that:
  - If more than one controller adds conditions to the same resource, conditions managed by the different controllers will be applied at different times.
  - Kubernetes API conventions account for exceptions to this rule; for known conditions, the absence of a condition status should be interpreted the same as `Unknown`, and typically indicates that reconciliation has not yet finished.
- We'll be using `metav1.Condition` from the Kubernetes API.
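As an example of the "apply conditions on first visit" rule, a minimal sketch using `meta.SetStatusCondition` from apimachinery; the condition type and reason strings here are illustrative, not CAPO's actual values.

```go
package controllers

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// initializeConditions stamps a Ready condition with status Unknown the
// first time the controller visits the resource, before the outcome of
// reconciliation is known. Reason is always set, per the convention above.
func initializeConditions(conditions *[]metav1.Condition, generation int64) {
	// SetStatusCondition adds the condition if missing and only bumps
	// LastTransitionTime when the status actually changes.
	meta.SetStatusCondition(conditions, metav1.Condition{
		Type:               "Ready",
		Status:             metav1.ConditionUnknown,
		Reason:             "ReconciliationPending", // illustrative reason
		Message:            "Reconciliation has not completed yet",
		ObservedGeneration: generation,
	})
}
```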
Terminal Failures
By getting rid of terminal failures, we have an opportunity to improve CAPO's ability to handle OpenStack infrastructure failures, such as API rate limits or temporary unavailability, which unfortunately happen often in large-scale production clouds. We'll need to investigate what these failures can be and how we treat them; for each failure, either (see the sketch after this list):
- CAPO continues to reconcile the resource and updates conditions with a temporary state, or
- CAPO stops reconciling the resource and updates conditions with a human-readable error message.
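A minimal sketch of how these two options could look inside a reconcile helper, assuming a hypothetical isTransient classifier for OpenStack API errors; none of these names are CAPO's actual API.

```go
package controllers

import (
	"time"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	ctrl "sigs.k8s.io/controller-runtime"
)

// isTransient is a hypothetical classifier; a real one would inspect
// gophercloud error types or HTTP status codes (e.g. 429, 503).
func isTransient(err error) bool {
	return false // placeholder
}

// handleOpenStackError implements the two options above: keep retrying on
// temporary failures, or stop and leave a human-readable message on the
// condition instead of the removed failureReason/failureMessage fields.
func handleOpenStackError(conditions *[]metav1.Condition, err error) (ctrl.Result, error) {
	if isTransient(err) {
		// Option 1: report a temporary state and requeue for another try.
		meta.SetStatusCondition(conditions, metav1.Condition{
			Type:    "Ready",
			Status:  metav1.ConditionFalse,
			Reason:  "OpenStackAPITemporarilyUnavailable", // illustrative
			Message: err.Error(),
		})
		return ctrl.Result{RequeueAfter: time.Minute}, nil
	}
	// Option 2: stop reconciling; the condition carries the explanation.
	meta.SetStatusCondition(conditions, metav1.Condition{
		Type:    "Ready",
		Status:  metav1.ConditionFalse,
		Reason:  "ReconciliationFailed", // illustrative
		Message: err.Error(),
	})
	return ctrl.Result{}, nil
}
```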
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale