gateway-api
Chaos Engineering/Fault Injection
What would you like to be added:
An API to inject faults into routes. Typically a fault is an injected delay, a fixed HTTP response, or something similar.
Why this is needed:
Fault injection / chaos engineering is a fairly common engineering practice of intentionally introducing errors into a system to exercise disaster recovery and other reliability mechanisms.
Prior art:
Slightly related to https://github.com/kubernetes-sigs/gateway-api/issues/2826
How do you see this being implemented, as a testing library that we provide to implementations or just an API?
No, just an HTTPRoute filter or policy. Like https://istio.io/latest/docs/reference/config/networking/virtual-service/#HTTPFaultInjection, for example. Then users or tooling can use that to build a holistic chaos engineering strategy.
Ok, just wanted to be sure. The API specification for this sounds good, but the reason I asked about test libs is that I wouldn't be opposed to discussing having some standard "Gateway API Fault Injection" tooling. 🤔
In any case, I'm supportive of further discussing and working towards a proposal here.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
Hi @craigbox 👋
We see you've dropped the stale label on this one. Did you have plans or some thoughts on how we can move the conversation forward here?
Hi Shane,
No particular plans or thoughts, sorry. I was just offering some community housekeeping.
Your comment reminded me of a similar one here. I understand the constraints that we all operate under, but I don't think the passage of time has changed the request in a way that suggests it is "stale"; rather, it just hasn't made it to the top of anyone's priority list yet. (To me, "stale" implies that time may have diminished the validity of a bug report or reduced its impact on a user, which doesn't seem to apply here.)
My intent was to link to this issue when highlighting that Gateway API doesn't support this feature, meaning Istio users must continue to use the legacy APIs. I think it is a better experience for someone finding this issue to see that it remains an open request, rather than seeing it "closed", which can imply that the feature request was not valid.
Gotcha. In case it's helpful: we don't consider closed or stale to mean invalid, or "we will never do this". Rather, it's often an accurate representation of priority. If no community member can come forward and champion the issue, and there's no support to iterate on it during a release window, then it does not have priority, and closing it is often how we reflect this when the situation persists for long periods of time (see our documentation on bumping stale issues for the more official stance on our approach).
Another good example is Rate Limiting. I personally feel this is absolutely something we should have in the API, but because it sat for so long and nobody (including myself) had the priority to move it forward, closing it as "will re-open when someone is specifically ready to drive this" may help remove some of the ambiguity that comes with "perpetually open without movement for many years".
Neither option is ideal of course, but compromises are always part of the process. In any case, we are glad that you continue to have interest in this feature. Perhaps you could put something on the agenda for an upcoming community meeting to talk about your interest in it, and promote it some to see if some new support can be garnered to move it forward?
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".