cluster-api-provider-openstack

[WIP] ✨ load balancers: delete orphaned load balancers from the same network

Open seanschneeweiss opened this issue 4 years ago • 7 comments

What this PR does / why we need it:

Deleting a cluster also deletes the CCM, so load balancers managed by the CCM may remain in the cluster's network. These orphaned load balancers block network deletion. This change deletes all remaining load balancers in the network identified by OpenStackCluster.Status.Network.ID.
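For illustration only, a minimal sketch of this cleanup using gophercloud's Octavia (load balancer v2) bindings. The function name and error handling are hypothetical, not the PR's actual code, and an authenticated load balancer service client is assumed:

```go
package cleanup

import (
	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack/loadbalancer/v2/loadbalancers"
)

// deleteOrphanedLoadBalancers (hypothetical name) deletes every load
// balancer whose VIP is on the given network, e.g. the network from
// OpenStackCluster.Status.Network.ID.
func deleteOrphanedLoadBalancers(client *gophercloud.ServiceClient, networkID string) error {
	allPages, err := loadbalancers.List(client, loadbalancers.ListOpts{
		VipNetworkID: networkID, // only load balancers on the cluster network
	}).AllPages()
	if err != nil {
		return err
	}
	lbs, err := loadbalancers.ExtractLoadBalancers(allPages)
	if err != nil {
		return err
	}
	for _, lb := range lbs {
		// Cascade delete also removes the LB's listeners, pools and members.
		err := loadbalancers.Delete(client, lb.ID, loadbalancers.DeleteOpts{Cascade: true}).ExtractErr()
		if err != nil {
			return err
		}
	}
	return nil
}
```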

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):

Fixes #842.

Special notes for your reviewer:

This PR also adds some error wrapping, but only for the functions related to this PR. If desired, I could convert the remaining errors.Errorf calls to errors.Wrap in a separate PR.
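For context, the difference between the two styles with github.com/pkg/errors; deleteLoadBalancer below is a stand-in defined only for this example:

```go
package main

import (
	"fmt"

	"github.com/pkg/errors"
)

// deleteLoadBalancer is a stand-in that always fails, for illustration.
func deleteLoadBalancer(id string) error {
	return fmt.Errorf("octavia returned 409 for %s", id)
}

func main() {
	id := "lb-123"

	// errors.Errorf flattens the original error into a new string:
	err1 := errors.Errorf("failed to delete load balancer %s: %v", id, deleteLoadBalancer(id))

	// errors.Wrap keeps the original error as the cause, so callers
	// can still recover it with errors.Cause:
	err2 := errors.Wrap(deleteLoadBalancer(id), "failed to delete load balancer")

	fmt.Println(err1)
	fmt.Println(errors.Cause(err2)) // prints the underlying error
}
```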

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • [x] squashed commits
  • if necessary:
    • [ ] includes documentation
    • [ ] adds unit tests

/hold

Sean Schneeweiss [email protected], Daimler TSS GmbH, Provider Information

seanschneeweiss avatar Sep 08 '21 11:09 seanschneeweiss

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: seanschneeweiss To complete the pull request process, please assign hidekazuna after the PR has been reviewed. You can assign the PR to them by writing /assign @hidekazuna in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Sep 08 '21 11:09 k8s-ci-robot

Does this have to be load-balancers on the same network? Can we not just delete load-balancers that match the naming convention with the cluster's name substituted?

By default, load-balancers are made with the name kube_service_<clustername>_<namespace>_<servicename>.
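A minimal sketch of that name-based filter, again with gophercloud's load balancer v2 bindings; deleteByNamePrefix is a hypothetical name, not code from this PR:

```go
import (
	"fmt"
	"strings"

	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack/loadbalancer/v2/loadbalancers"
)

// deleteByNamePrefix (hypothetical) deletes load balancers whose names
// follow the CCM convention kube_service_<clustername>_..., regardless
// of which network their VIP is attached to.
func deleteByNamePrefix(client *gophercloud.ServiceClient, clusterName string) error {
	prefix := fmt.Sprintf("kube_service_%s_", clusterName)
	allPages, err := loadbalancers.List(client, loadbalancers.ListOpts{}).AllPages()
	if err != nil {
		return err
	}
	lbs, err := loadbalancers.ExtractLoadBalancers(allPages)
	if err != nil {
		return err
	}
	for _, lb := range lbs {
		if !strings.HasPrefix(lb.Name, prefix) {
			continue
		}
		if err := loadbalancers.Delete(client, lb.ID, loadbalancers.DeleteOpts{Cascade: true}).ExtractErr(); err != nil {
			return err
		}
	}
	return nil
}
```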

An alternative approach, which would not be implemented in this provider but probably in the core provider, would be to delete all services of type LoadBalancer prior to deleting the cluster. This would also remove the OpenStack load-balancers, but would also work for all the public clouds as well.

A similar approach could be taken for Cinder volumes created using the Cinder CSI (which are also orphaned currently when a cluster is deleted). The generic approach here would be to delete all PVCs before deleting the cluster, which would also work with all other storage providers. Of course, in this case an opt-out (or explicit opt-in) is more important to avoid potential data loss.
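A sketch of the generic, cloud-agnostic approach for Services, assuming a client-go clientset pointed at the workload cluster; the helper name is hypothetical:

```go
package cleanup

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deleteLoadBalancerServices (hypothetical) deletes every Service of
// type LoadBalancer in the workload cluster, letting the CCM tear down
// the cloud load balancers before the cluster itself is deleted.
func deleteLoadBalancerServices(ctx context.Context, cs kubernetes.Interface) error {
	svcs, err := cs.CoreV1().Services(metav1.NamespaceAll).List(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	for _, svc := range svcs.Items {
		if svc.Spec.Type != corev1.ServiceTypeLoadBalancer {
			continue
		}
		if err := cs.CoreV1().Services(svc.Namespace).Delete(ctx, svc.Name, metav1.DeleteOptions{}); err != nil {
			return err
		}
	}
	return nil
}
```

The same loop shape would work for PVCs via cs.CoreV1().PersistentVolumeClaims(...), with the opt-in/opt-out guard discussed above.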

mkjpryor avatar Sep 15 '21 10:09 mkjpryor

I haven't yet reviewed this properly, but this sounds like something we could cover with an E2E test? For example, in the existing CCM create/delete test, could we provision something in the cluster which creates an LB, I'm guessing a Service? Then we could test that we correctly clean it up.
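A sketch of what such an E2E step might provision, assuming client-go and illustrative names; the CCM would then create an Octavia load balancer for this Service, which the delete path must clean up:

```go
package e2e

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// createLBService (illustrative) provisions a Service of type
// LoadBalancer in the workload cluster as a fixture for the cleanup test.
func createLBService(ctx context.Context, cs kubernetes.Interface) error {
	svc := &corev1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: "lb-cleanup-test"},
		Spec: corev1.ServiceSpec{
			Type:     corev1.ServiceTypeLoadBalancer,
			Selector: map[string]string{"app": "lb-cleanup-test"},
			Ports:    []corev1.ServicePort{{Port: 80}},
		},
	}
	_, err := cs.CoreV1().Services("default").Create(ctx, svc, metav1.CreateOptions{})
	return err
}
```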

mdbooth avatar Sep 24 '21 10:09 mdbooth

@seanschneeweiss: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jan 13 '22 18:01 k8s-ci-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Apr 13 '22 19:04 k8s-triage-robot

/remove-lifecycle stale

seanschneeweiss avatar Apr 19 '22 22:04 seanschneeweiss

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jul 18 '22 23:07 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Aug 21 '22 09:08 k8s-triage-robot

/remove-lifecycle stale

Will rebase and continue to work on this after summer holidays.

seanschneeweiss avatar Aug 31 '22 15:08 seanschneeweiss

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Sep 30 '22 15:09 k8s-triage-robot

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Sep 30 '22 15:09 k8s-ci-robot

@seanschneeweiss do you think you might have some time to help get this landed?

mnaser avatar Oct 26 '22 13:10 mnaser

/remove-lifecycle rotten
/reopen

Happy to land this :)

seanschneeweiss avatar Oct 28 '22 12:10 seanschneeweiss

@seanschneeweiss: Reopened this PR.

In response to this:

/remove-lifecycle rotten
/reopen

Happy to land this :)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Oct 28 '22 12:10 k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: seanschneeweiss

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • ~~OWNERS~~ [seanschneeweiss]

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Oct 28 '22 12:10 k8s-ci-robot

@seanschneeweiss: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cluster-api-provider-openstack-test | 624c50625a55661e523c1109dd0983ac181a38f0 | link | true | /test pull-cluster-api-provider-openstack-test |
| pull-cluster-api-provider-openstack-build | 624c50625a55661e523c1109dd0983ac181a38f0 | link | true | /test pull-cluster-api-provider-openstack-build |
| pull-cluster-api-provider-openstack-e2e-test | 624c50625a55661e523c1109dd0983ac181a38f0 | link | true | /test pull-cluster-api-provider-openstack-e2e-test |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-ci-robot avatar Oct 28 '22 12:10 k8s-ci-robot

@seanschneeweiss FYI, OpenStack Magnum deals with this similarly, so it might help you with the logic:

https://github.com/openstack/magnum/blob/0ee8abeed0ab90baee98a92cab7c684313bab906/magnum/common/octavia.py#L75
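Paraphrasing the linked Magnum logic in Go (a rough sketch, not a translation of the actual Python): skip load balancers in a transient provisioning state and cascade-delete the stable ones, retrying the rest later. The function name is hypothetical and the gophercloud client is assumed as in the earlier sketches:

```go
import (
	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack/loadbalancer/v2/loadbalancers"
)

// deleteStableLoadBalancers (hypothetical) cascade-deletes only load
// balancers in a stable provisioning state; the rest are left for a
// later reconcile.
func deleteStableLoadBalancers(client *gophercloud.ServiceClient, lbs []loadbalancers.LoadBalancer) error {
	for _, lb := range lbs {
		switch lb.ProvisioningStatus {
		case "ACTIVE", "ERROR":
			if err := loadbalancers.Delete(client, lb.ID, loadbalancers.DeleteOpts{Cascade: true}).ExtractErr(); err != nil {
				return err
			}
		case "PENDING_DELETE":
			// Deletion already in progress; nothing to do.
		default:
			// PENDING_CREATE / PENDING_UPDATE: the LB is immutable
			// right now, so retry on the next reconcile.
		}
	}
	return nil
}
```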

mnaser avatar Oct 28 '22 13:10 mnaser

I think this would be a good addition to CAPO. Generally I agree with what has been said in previous comments. Since CAPO already deletes networks that it has created, I see no problem deleting load balancers on the cluster network.

Just to make sure that everyone using CAPO understands this feature, a few lines added to the documentation might be useful.

huxcrux avatar Dec 01 '22 03:12 huxcrux

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Mar 01 '23 03:03 k8s-triage-robot


The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Apr 26 '23 15:04 k8s-triage-robot

/remove-lifecycle rotten

mdbooth avatar Apr 27 '23 14:04 mdbooth

I'm closing this PR as the parent issue was closed due to inactivity, and there might be other resources besides load balancers that should be deleted as well. This unmaintained repository might serve as a reference: https://github.com/giantswarm/cluster-api-cleaner-openstack

seanschneeweiss avatar Dec 16 '23 23:12 seanschneeweiss