cluster-api-provider-openstack

Prevent reconciliation of machines with invalid credentials

Open mdbooth opened this issue 2 years ago • 6 comments

/kind bug

A 'fat finger' bug we have now seen more than once in the wild: the user accidentally updates cloud credentials with credentials which are valid, but for the wrong cloud. In at least one case the wrong cloud was a staging environment; in another, the credentials referred to the wrong region.

In both cases we ended up marking the Machines as failed because the cloud we connected to reported that the underlying resources don't exist.

While this is user error, the user impact is significant (all machines become failed) and there should be some way for us to detect it. I propose adding metadata to OpenStackMachine.Status which uniquely and unambiguously identifies the cloud the Machine was originally created in. If the recorded identity doesn't match the cloud the current credentials point at, we would refuse to reconcile the machine at all, but not mark it failed.

I believe project ID would be a good fit for this, although it may need to be combined with region.
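
Roughly what I have in mind, as a sketch only; the CloudIdentity type and checkCloudIdentity helper below are hypothetical names, not existing CAPO API:

```go
// Hypothetical sketch, not the actual CAPO API: CloudIdentity and
// checkCloudIdentity are illustrative names only.
package main

import "fmt"

// CloudIdentity would be recorded in OpenStackMachine.Status when the
// machine is first created, pinning the cloud it belongs to.
type CloudIdentity struct {
	ProjectID string `json:"projectID"`
	Region    string `json:"region,omitempty"`
}

// checkCloudIdentity compares the identity recorded in status against the
// identity derived from the current credentials. On mismatch it returns an
// error so the reconcile is skipped and retried later, instead of setting
// FailureReason and permanently marking the machine failed.
func checkCloudIdentity(recorded, current CloudIdentity) error {
	if recorded.ProjectID == "" {
		// Nothing recorded yet (first reconcile): the caller should
		// persist the current identity rather than erroring.
		return nil
	}
	if recorded.ProjectID != current.ProjectID || recorded.Region != current.Region {
		return fmt.Errorf("credentials point at project %q/region %q but machine was created in project %q/region %q; refusing to reconcile",
			current.ProjectID, current.Region, recorded.ProjectID, recorded.Region)
	}
	return nil
}

func main() {
	recorded := CloudIdentity{ProjectID: "prod-1234", Region: "RegionOne"}
	current := CloudIdentity{ProjectID: "staging-5678", Region: "RegionOne"}
	fmt.Println(checkCloudIdentity(recorded, current))
}
```

The important property is that a mismatch blocks reconciliation without tripping the Machine into a terminal failed state, so fixing the credentials restores normal operation.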

Proposed for 0.6: #1145

mdbooth avatar Feb 23 '22 12:02 mdbooth

I guess this is a feature request rather than a bug, since it's caused by invalid user input and CAPO just acts on it accordingly, but I agree we should fix it nevertheless.

I think the issue you reported is:

  1. we create cloud.conf and create the cluster, and everything is fine
  2. cloud.conf gets updated for some reason, but with the wrong configuration
  3. the next reconcile then makes the compute resources become failed

OpenStackMachine.Status should be able to identify the machine, but what about other resources, e.g. security groups, LBs, etc.? Will they be impacted? How about recording some status at the cluster level so we can compare at a high level instead?
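
For the cluster-level idea, something roughly like this (all names are made up; no such field or helper exists in CAPO today):

```go
// Rough sketch of the cluster-level variant: record the cloud identity once
// in OpenStackCluster.Status, then have every reconciler (machine, security
// group, LB) run the same check before touching anything.
package main

import "fmt"

// verifyClusterCloud would run at the top of each reconcile loop.
func verifyClusterCloud(recordedProjectID, credentialProjectID string) error {
	if recordedProjectID == "" {
		// Not recorded yet; the cluster controller should persist it first.
		return nil
	}
	if recordedProjectID != credentialProjectID {
		return fmt.Errorf("cloud.conf points at project %q but the cluster was created in project %q; skipping reconcile",
			credentialProjectID, recordedProjectID)
	}
	return nil
}

func main() {
	fmt.Println(verifyClusterCloud("prod-1234", "staging-5678"))
}
```

That way the machine, security group, and LB reconcilers could all share one comparison instead of each resource recording its own identity.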

jichenjc avatar Feb 24 '22 01:02 jichenjc

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar May 25 '22 01:05 k8s-triage-robot

/remove-lifecycle stale
/kind feature

apricote avatar Jun 15 '22 13:06 apricote

/remove-kind bug

seanschneeweiss avatar Jun 15 '22 13:06 seanschneeweiss

/lifecycle stale

k8s-triage-robot avatar Sep 13 '22 14:09 k8s-triage-robot

/remove-lifecycle stale

jichenjc avatar Sep 13 '22 23:09 jichenjc

/lifecycle stale

k8s-triage-robot avatar Dec 13 '22 00:12 k8s-triage-robot


/lifecycle rotten

k8s-triage-robot avatar Jan 12 '23 01:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Feb 11 '23 02:02 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Feb 11 '23 02:02 k8s-ci-robot