cluster-api-provider-aws Graduation of EventBridge Feature

This issue tracks graduation of the EventBridge feature, which currently watches only instance state change events.

Other types of CloudWatch events can also be watched and acted upon, such as ASG scale-in events and Spot Instance termination notices.
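For illustration only, here is a minimal sketch of an EventBridge event pattern that would match the three kinds of events mentioned so far. The `source` and `detail-type` strings are the ones AWS documents for these events as far as I know, so verify them before relying on this. `eventPattern` and `buildPattern` are hypothetical names and are not part of CAPA.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// eventPattern models the small subset of the EventBridge event pattern
// format used here: matching on "source" and "detail-type".
// Hypothetical type, not taken from the CAPA codebase.
type eventPattern struct {
	Source     []string `json:"source"`
	DetailType []string `json:"detail-type"`
}

// buildPattern returns the JSON pattern for a rule that matches instance
// state changes, Spot interruption warnings, and ASG terminate lifecycle hooks.
func buildPattern() (string, error) {
	p := eventPattern{
		Source: []string{"aws.ec2", "aws.autoscaling"},
		DetailType: []string{
			"EC2 Instance State-change Notification",
			"EC2 Spot Instance Interruption Warning",
			"EC2 Instance-terminate Lifecycle Action",
		},
	}
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	pattern, err := buildPattern()
	if err != nil {
		panic(err)
	}
	fmt.Println(pattern)
}
```

A single rule with a combined pattern keeps the example short; separate rules per event type would work just as well and may be easier to enable per feature.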

Some use cases for these:

The current AWSMachinePool implementation has no draining logic. With lifecycle hooks, we can act on instance termination events and drain the instances during rollout or scale-in (see graceful shutdown of MachinePools for details).
We plan to add Spot Instance support to AWSMachinePool in the near future. Termination notices could give us enough time to drain Spot Instances before they shut down (notices give about 2 minutes in some scenarios).
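The lifecycle hook use case could be sketched roughly as follows: drain within a time budget, then complete the hook so termination can proceed. This is only a sketch under assumptions. `Drainer`, `HookCompleter`, `handleTermination`, and `recorder` are hypothetical names, and the `"CONTINUE"` result is assumed to map to the result value of the Auto Scaling `CompleteLifecycleAction` call. The same bounded-drain idea would apply to Spot notices, with the roughly 2 minute window as the budget.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Drainer and HookCompleter are hypothetical interfaces. A real Drainer
// would cordon the node and evict pods; a real HookCompleter would call
// the Auto Scaling API to complete the lifecycle action.
type Drainer interface {
	Drain(ctx context.Context, instanceID string) error
}

type HookCompleter interface {
	Complete(ctx context.Context, instanceID, result string) error
}

// handleTermination drains the instance within the given budget and then
// completes the lifecycle hook, even if draining failed or timed out, so
// the instance is not left waiting for the hook's own timeout.
func handleTermination(ctx context.Context, d Drainer, h HookCompleter, instanceID string, budget time.Duration) error {
	drainCtx, cancel := context.WithTimeout(ctx, budget)
	defer cancel()

	drainErr := d.Drain(drainCtx, instanceID)

	if err := h.Complete(ctx, instanceID, "CONTINUE"); err != nil {
		return fmt.Errorf("completing lifecycle hook for %s: %w", instanceID, err)
	}
	return drainErr
}

// recorder is a fake that implements both interfaces and records calls.
type recorder struct{ calls []string }

func (r *recorder) Drain(ctx context.Context, id string) error {
	r.calls = append(r.calls, "drain "+id)
	return ctx.Err() // nil unless the budget has already expired
}

func (r *recorder) Complete(_ context.Context, id, result string) error {
	r.calls = append(r.calls, "complete "+id+" "+result)
	return nil
}

// runExample wires the fake into handleTermination with a 2 minute budget.
func runExample() ([]string, error) {
	r := &recorder{}
	err := handleTermination(context.Background(), r, r, "i-0123456789abcdef0", 2*time.Minute)
	return r.calls, err
}

func main() {
	calls, err := runExample()
	fmt.Println(calls, err)
}
```

Completing the hook unconditionally is a design choice in this sketch: it favours a timely termination over a guaranteed drain, which may or may not be what a real controller wants.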

As part of this effort, the existing implementation should be modified to accommodate different types of events and be flexible enough for several controllers to use. Also, where events need to be exposed to the controllers (probably the case for AWSMachinePool draining), helper functions should make it possible to read and delete events.
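One possible shape for this, shown purely as a sketch: controllers register a handler per event type, and a small queue abstraction exposes read and delete helpers. None of these names (`Event`, `Queue`, `Dispatcher`, `memQueue`) come from the CAPA codebase, and the in-memory queue stands in for whatever backing queue the real implementation uses.

```go
package main

import "fmt"

// Event is a simplified, hypothetical view of an EventBridge event.
type Event struct {
	ID         string
	DetailType string
	Detail     map[string]string
}

// Handler processes one event; a nil return means the event may be deleted.
type Handler func(Event) error

// Queue is a hypothetical abstraction exposing read/delete helpers.
type Queue interface {
	Read() ([]Event, error)
	Delete(id string) error
}

// Dispatcher routes events to handlers registered per detail-type,
// so several controllers can share one event source.
type Dispatcher struct{ handlers map[string]Handler }

func NewDispatcher() *Dispatcher { return &Dispatcher{handlers: map[string]Handler{}} }

func (d *Dispatcher) Register(detailType string, h Handler) { d.handlers[detailType] = h }

// Process reads pending events and deletes those handled successfully.
// Events with no handler, or whose handler fails, stay in the queue.
// It returns the number of handled events and the first error seen.
func (d *Dispatcher) Process(q Queue) (int, error) {
	events, err := q.Read()
	if err != nil {
		return 0, err
	}
	handled := 0
	var firstErr error
	for _, ev := range events {
		h, ok := d.handlers[ev.DetailType]
		if !ok {
			continue
		}
		err := h(ev)
		if err == nil {
			err = q.Delete(ev.ID)
		}
		if err != nil {
			if firstErr == nil {
				firstErr = err
			}
			continue
		}
		handled++
	}
	return handled, firstErr
}

// memQueue is an in-memory Queue used only for this example.
type memQueue struct{ events []Event }

// Read returns a copy so callers can delete while iterating.
func (q *memQueue) Read() ([]Event, error) {
	return append([]Event(nil), q.events...), nil
}

func (q *memQueue) Delete(id string) error {
	for i, ev := range q.events {
		if ev.ID == id {
			q.events = append(q.events[:i], q.events[i+1:]...)
			return nil
		}
	}
	return fmt.Errorf("event %q not found", id)
}

func main() {
	q := &memQueue{events: []Event{
		{ID: "1", DetailType: "EC2 Instance State-change Notification",
			Detail: map[string]string{"instance-id": "i-0123456789abcdef0", "state": "terminated"}},
		{ID: "2", DetailType: "Some Other Event"},
	}}
	d := NewDispatcher()
	d.Register("EC2 Instance State-change Notification", func(ev Event) error {
		fmt.Printf("instance %s is now %s\n", ev.Detail["instance-id"], ev.Detail["state"])
		return nil
	})
	handled, err := d.Process(q)
	fmt.Println(handled, len(q.events), err)
}
```

Leaving unhandled events in the queue means a controller that registers later can still see them, at the cost of needing some expiry policy for events nobody claims.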

Apr 13 '22 03:04 sedefsavas

@sedefsavas: This issue is currently awaiting triage.

If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Apr 13 '22 03:04 k8s-ci-robot

/milestone v1.5.0

Apr 13 '22 03:04 sedefsavas

/assign

Apr 13 '22 04:04 Ankitasw

/label adr-required

Apr 13 '22 07:04 Ankitasw

@Ankitasw: The label(s) /label adr-required cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor

Apr 13 '22 07:04 k8s-ci-robot

I'd like to add another use case for this feature: AWS can also schedule disruptive events for EC2 instances. One example is maintenance for degraded hardware, where the instance needs to be stopped and started to move to new hardware. As a CAPA cluster operator, I have no way to notice these scheduled events and react appropriately. This is highly undesirable when, e.g., the instance is a control plane machine. I'd like CAPA to use EventBridge to capture these scheduled events and reflect them in a condition so that, as a cluster operator, I can intervene manually or through automation.
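To make the request concrete, here is a rough sketch of how pending scheduled events could be folded into a single condition. Everything here is hypothetical: `ScheduledEvent`, `Condition`, and the `ScheduledDisruption` condition type are illustrative names rather than CAPA or Cluster API types, and how the events reach the controller (EventBridge or polling the EC2 API) is left open. The event codes in the comments are the ones EC2 reports for scheduled events, to the best of my knowledge.

```go
package main

import (
	"fmt"
	"time"
)

// ScheduledEvent is a simplified, hypothetical view of an EC2 scheduled event.
type ScheduledEvent struct {
	Code      string    // e.g. "instance-stop", "instance-retirement", "system-maintenance"
	NotBefore time.Time // earliest time the event may start
}

// Condition is a hypothetical stand-in for a status condition on the machine.
type Condition struct {
	Type    string
	Status  bool
	Reason  string
	Message string
}

// conditionType is an illustrative name, not an existing CAPA condition.
const conditionType = "ScheduledDisruption"

// conditionFor reports whether any scheduled event is pending and, if so,
// describes the one that may start first.
func conditionFor(events []ScheduledEvent) Condition {
	if len(events) == 0 {
		return Condition{Type: conditionType, Status: false, Reason: "NoScheduledEvents"}
	}
	earliest := events[0]
	for _, ev := range events[1:] {
		if ev.NotBefore.Before(earliest.NotBefore) {
			earliest = ev
		}
	}
	return Condition{
		Type:   conditionType,
		Status: true,
		Reason: "ScheduledEventPending",
		Message: fmt.Sprintf("%s scheduled no earlier than %s",
			earliest.Code, earliest.NotBefore.UTC().Format(time.RFC3339)),
	}
}

func main() {
	events := []ScheduledEvent{
		{Code: "system-maintenance", NotBefore: time.Date(2022, 6, 10, 8, 0, 0, 0, time.UTC)},
		{Code: "instance-stop", NotBefore: time.Date(2022, 6, 3, 8, 0, 0, 0, time.UTC)},
	}
	c := conditionFor(events)
	fmt.Println(c.Status, c.Reason, c.Message)
}
```

With a condition like this, an operator or an external automation could watch for `Status == true` on control plane machines and replace them before the scheduled time.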

May 27 '22 12:05 enxebre

/milestone v1.6.0

Jul 25 '22 16:07 richardcase

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Oct 23 '22 18:10 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

Dec 10 '22 18:12 k8s-triage-robot

@Ankitasw - shall we look at the graduation in the new year?

Dec 15 '22 14:12 richardcase

Yes, I started it a while ago but got busy focusing on E2E. Once the E2E work is done, this will be my focus.

Dec 15 '22 14:12 Ankitasw

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen
Mark this issue as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Jan 14 '23 14:01 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Jan 14 '23 14:01 k8s-ci-robot

/remove-lifecycle rotten
/priority important-longterm

Feb 09 '23 08:02 richardcase

Hey @Ankitasw, is there a way to track progress on this? We are very interested in https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/2574, but it seems to be blocked by this issue. Thank you!

Mar 22 '23 11:03 fiunchinho

This is on our radar; we will try to get it prioritized soon.

Mar 28 '23 13:03 Ankitasw

/lifecycle active

May 23 '23 10:05 Ankitasw

This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged. Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.

You can:

Confirm that this issue is still relevant with /triage accepted (org members only)
Deprioritize it with /priority important-longterm or /priority backlog
Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

Jan 19 '24 00:01 k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Apr 18 '24 00:04 k8s-triage-robot

Hey @Ankitasw, just checking in 🙂. Any progress to share?

Apr 25 '24 16:04 fiunchinho

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

May 25 '24 17:05 k8s-triage-robot
