gateway-api
conformance: Conformance tests need a way to test that a percentage of requests succeed or fail
What would you like to be added: In #1243, I added some conformance testing around the new BackendRef cases, but one case we couldn't catch is the following:
- There are two backendRefs in the HTTPRoute backendRef list, with equal weights
- One is invalid
- Half of the requests sent through the implementation must get a 500.
Why this is needed: Weighted load balancing is part of the spec; having tooling that lets us test it as part of conformance will help a lot.
Downside: It's going to be tricky to write these tests so they aren't flaky. I suspect we'll need to provide a target percentage, add some wiggle room to it, and then send some number of requests. It seems likely to me that the details will matter a lot here.
This may need to be a GEP; let's discuss here first.
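For concreteness, here is a minimal Go sketch of the kind of helper such a test might need; the function name, signature, and tolerance handling are all hypothetical and not part of the existing conformance suite.

```go
package conformance

import (
	"fmt"
	"net/http"
)

// expectStatusFraction is a hypothetical helper: it sends total requests to
// url and checks that the fraction of responses with status wantStatus is
// within tolerance of wantFraction (e.g. 0.50 +/- 0.05 for the case above,
// where one of two equal-weight backendRefs is invalid).
func expectStatusFraction(url string, total, wantStatus int, wantFraction, tolerance float64) error {
	matched := 0
	for i := 0; i < total; i++ {
		resp, err := http.Get(url)
		if err != nil {
			return fmt.Errorf("request %d failed: %w", i, err)
		}
		resp.Body.Close()
		if resp.StatusCode == wantStatus {
			matched++
		}
	}
	got := float64(matched) / float64(total)
	if got < wantFraction-tolerance || got > wantFraction+tolerance {
		return fmt.Errorf("got %d/%d (%.1f%%) responses with status %d, want %.0f%% +/- %.0f%%",
			matched, total, got*100, wantStatus, wantFraction*100, tolerance*100)
	}
	return nil
}
```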
This definitely feels like it would be flaky to test. Would using filters to direct 100% of traffic to each of a valid and an invalid backend for the same route be a viable alternative to this, or would that not be sufficient?
Well, in the spec for backendRef, the above behavior is mandated (that is, if there are multiple backends and one is invalid, the invalid one should instead produce an equivalent percentage of 500 errors), so ideally we should have a way to test that implementations do that in conformance.
That's why I suggested a target percentage value and some wiggle room (maybe 50%, plus or minus 5%, or something like that for the two-backend case).
Having filters direct traffic doesn't meet the spec item that we're trying to test, unfortunately.
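To put rough numbers on the "50% plus or minus 5%" idea: if each request independently lands on the invalid backend with probability p = 0.5, the observed failure fraction over n requests has standard deviation sqrt(p(1-p)/n). The sketch below (using an illustrative three-sigma threshold, not anything mandated by the spec) suggests that on the order of a thousand requests are needed before a +/-5% window stops being flaky.

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	p := 0.5          // expected 500 fraction for two equal-weight backends, one invalid
	tolerance := 0.05 // allowed wiggle room: +/- 5 percentage points

	// With n independent requests, the observed fraction has standard
	// deviation sqrt(p*(1-p)/n). Requiring three standard deviations to fit
	// inside the tolerance keeps the false-failure rate around 0.3%.
	for _, n := range []int{100, 500, 1000, 2000} {
		sd := math.Sqrt(p * (1 - p) / float64(n))
		fmt.Printf("n=%4d  stddev=%.4f  3*stddev=%.4f  fits +/-%.2f: %v\n",
			n, sd, 3*sd, tolerance, 3*sd <= tolerance)
	}
}
```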
I agree that both of the things you're describing are necessary and related. We should probably track both of them here or create a separate issue for the test @mikemorris described. To ensure this isn't flaky, we'll likely need a relatively large number of requests. This kind of test may require a "slow" label, like upstream Kubernetes tests use, so we can skip it for faster test runs.
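As an illustration (not the suite's actual labeling mechanism), a request-heavy test could be guarded the way Go's testing package handles short runs, with testing.Short() standing in for whatever "slow" label the suite adopts:

```go
package conformance

import "testing"

// TestWeightedInvalidBackendRef sketches a request-heavy conformance test
// that is skipped during fast runs, analogous to a "slow" label.
func TestWeightedInvalidBackendRef(t *testing.T) {
	if testing.Short() {
		t.Skip("skipping slow weighted-backend test in -short mode")
	}
	// Hypothetical helper from the earlier sketch: ~1000 requests,
	// expecting roughly 50% HTTP 500 responses with a +/-5% tolerance.
	if err := expectStatusFraction("http://gateway.example/", 1000, 500, 0.50, 0.05); err != nil {
		t.Fatal(err)
	}
}
```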
I agree that we should create another issue for the test Mike described.
Well, looking at HTTPRouteFilter, it appears the functionality I was envisioning doesn't really exist (so I'm not opening an issue to test it, hah). HTTPRequestMirrorFilter is somewhat similar, but it would ignore the actual configured backendRefs on the HTTPRoute. I was thinking of something like an HTTPRequestLabelSelectorFilter using https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ to filter backendRefs by metadata.
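Purely to illustrate that idea (no such filter exists in the Gateway API, and every name below is invented), a label-selector filter might be sketched as:

```go
package v1alpha2

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// HTTPRequestLabelSelectorFilter is a hypothetical filter type, sketched only
// to illustrate the comment above; nothing like it exists in the Gateway API.
// It would restrict a rule's configured backendRefs to the backends whose
// metadata labels match the selector.
type HTTPRequestLabelSelectorFilter struct {
	// BackendSelector selects, by label, which of the rule's backendRefs
	// should receive traffic when this filter applies.
	BackendSelector metav1.LabelSelector `json:"backendSelector"`
}
```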
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.