cluster-api-provider-openstack

🐛 Fix OpenStackServer reconciliation stuck when cluster is unpaused

Open bnallapeta opened this issue 1 month ago • 6 comments

What this PR does / why we need it: When a cluster is paused (e.g., during a pivot operation), OpenStackServer resources stop reconciling. However, when the cluster is unpaused, they don't resume because the controller doesn't watch for cluster pause/unpause events.

This PR adds a watch on Cluster resources so OpenStackServers are re-queued when their parent cluster transitions from paused to unpaused state.
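As a rough illustration of the idea (not the code in this PR), the sketch below shows the transition check in plain Go. The `cluster` struct, the `unpaused` and `serversToRequeue` helpers, and the name-keyed map are all hypothetical stand-ins; the real controller would express this through controller-runtime watches and predicates rather than these types.

```go
package main

import "fmt"

// cluster is a hypothetical stand-in for the Cluster API Cluster object.
// Only the paused flag matters for this sketch.
type cluster struct {
	Name   string
	Paused bool
}

// unpaused reports whether an update moved the cluster from paused to
// unpaused. Any other update (still paused, still running, newly paused)
// returns false.
func unpaused(oldC, newC cluster) bool {
	return oldC.Paused && !newC.Paused
}

// serversToRequeue returns the names of the servers that belong to the
// cluster, but only when the update is a paused-to-unpaused transition.
// serversByCluster is an assumed lookup table keyed by cluster name.
func serversToRequeue(oldC, newC cluster, serversByCluster map[string][]string) []string {
	if !unpaused(oldC, newC) {
		return nil
	}
	return serversByCluster[newC.Name]
}

func main() {
	servers := map[string][]string{"demo": {"server-a", "server-b"}}

	before := cluster{Name: "demo", Paused: true}
	after := cluster{Name: "demo", Paused: false}

	// Unpause transition: the cluster's servers are returned for requeueing.
	fmt.Println(serversToRequeue(before, after, servers))

	// No transition: nothing is requeued.
	fmt.Println(len(serversToRequeue(after, after, servers)))
}
```

Filtering on the transition itself, rather than on every Cluster update, keeps unrelated Cluster changes from triggering extra reconciles.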

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #2824

TODOs:

  • [x] squashed commits
  • if necessary:
    • [ ] includes documentation
    • [x] adds unit tests

/hold

bnallapeta avatar Nov 13 '25 11:11 bnallapeta

Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!

| Name | Link |
| --- | --- |
| Latest commit | 828af4f1b69bd6dbd36bab3e2cf0899a48542cb8 |
| Latest deploy log | https://app.netlify.com/projects/kubernetes-sigs-cluster-api-openstack/deploys/69267f2f3e792900087e4539 |
| Deploy Preview | https://deploy-preview-2833--kubernetes-sigs-cluster-api-openstack.netlify.app |

To edit notification comments on pull requests, go to your Netlify project configuration.

netlify[bot] avatar Nov 13 '25 11:11 netlify[bot]

/approve

EmilienM avatar Nov 19 '25 14:11 EmilienM

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: EmilienM

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Nov 19 '25 14:11 k8s-ci-robot

/ok-to-test

bnallapeta avatar Nov 20 '25 01:11 bnallapeta

I am a bit worried that this will break clusterctl move. We will probably need to handle all these cases:

  1. Cluster paused and has deletion timestamp: do nothing <- This is now broken I think (we delete even when paused)
  2. Cluster unpaused and has deletion timestamp: reconcileDelete <- This was missing in first version I think
  3. Cluster paused and no deletion timestamp: do nothing ✔️
  4. Cluster unpaused and no deletion timestamp: reconcileNormal ✔️
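The four cases above can be sketched as a small decision function. This is only an illustration of the intended behavior, not the controller's actual code: the `action` type, its constants, and `decide` are hypothetical names, and the real reconciler would read the paused state and deletion timestamp from the Cluster and OpenStackServer objects.

```go
package main

import "fmt"

// action names what the reconciler should do for one combination of
// paused state and deletion timestamp. The type is hypothetical.
type action string

const (
	actionNone   action = "do nothing"
	actionDelete action = "reconcileDelete"
	actionNormal action = "reconcileNormal"
)

// decide maps the four cases from the review comment to an action.
// Paused always wins, so a paused cluster is never touched, even when a
// deletion timestamp is set (cases 1 and 3).
func decide(paused, hasDeletionTimestamp bool) action {
	switch {
	case paused:
		return actionNone
	case hasDeletionTimestamp:
		return actionDelete // case 2
	default:
		return actionNormal // case 4
	}
}

func main() {
	for _, c := range []struct{ paused, deleting bool }{
		{true, true},
		{false, true},
		{true, false},
		{false, false},
	} {
		fmt.Printf("paused=%t deleting=%t -> %s\n", c.paused, c.deleting, decide(c.paused, c.deleting))
	}
}
```

Checking the paused state before the deletion timestamp is what keeps case 1 from deleting resources on a paused cluster.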

The full test is testing clusterctl move, so let's check:

/test pull-cluster-api-provider-openstack-e2e-full-test

lentzi90 avatar Dec 11 '25 09:12 lentzi90

@bnallapeta: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cluster-api-provider-openstack-e2e-full-test | 828af4f1b69bd6dbd36bab3e2cf0899a48542cb8 | link | false | /test pull-cluster-api-provider-openstack-e2e-full-test |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

k8s-ci-robot avatar Dec 11 '25 10:12 k8s-ci-robot