Enable ingress conformance tests
Once Ingress v1 support is complete, we should enable the ingress conformance tests in CI
Prerequisite for #1944
Blocked by #403
See: https://github.com/kubernetes-sigs/ingress-controller-conformance
And a separate effort: https://github.com/aledbf/ingress-conformance
Ingress conformance is basically blocked. There's no agreement upstream on what the harness should look like, so verification checks can't be committed.
Removing this from the prioritize backlog because it is blocked upstream.
@jpeach i would prefer we add the blocked tag and leave it in the backlog. we still want to do it at the current priority or as soon as possible given the upstream block. can you please make the changes?
@jpeach i would prefer we add the blocked tag and leave it in the backlog. we still want to do it at the current priority or as soon as possible given the upstream block. can you please make the changes?
I can move it back, but I don't see the benefit of having issues in the prioritized queue that can't be worked on.
Does this still look blocked given the state of https://github.com/kubernetes-sigs/ingress-controller-conformance? (theres some example of running conformance with sonobuoy and kind etc. https://github.com/kubernetes-sigs/ingress-controller-conformance/blob/master/examples/kind/.github/workflows/main.yaml)
Does this still look blocked given the state of https://github.com/kubernetes-sigs/ingress-controller-conformance? (theres some example of running conformance with sonobuoy and kind etc. https://github.com/kubernetes-sigs/ingress-controller-conformance/blob/master/examples/kind/.github/workflows/main.yaml)
See https://github.com/projectcontour/contour/issues/1944
Test results from a couple runs: ingress-conformance-run-1.tar.gz ingress-conformance-run-2.tar.gz
Tests fail intermittently with connection refused, my guess is that is b/c the address/status on the Ingress object is updated before Envoy listeners are actually updated (listeners were torn down from a previous test etc.)
~~It doesn't appear that tests are timing out but only 3 of the 5 features are being run~~ (Using presumably an older ingress-conformance container image)
Runs with the latest ingress-conformance image: ingress-conformance-run-3.tar.gz ingress-conformance-run-4.tar.gz
Failing features:
- ingress class:
Scenario: An Ingress with an invalid ingress class should not send traffic to the matching backend serviceerror Then The Ingress status should not contain the IP address or FQDN waiting for Ingress status should not return an IP address or FQDN- We do not create an ingress class by default in our example deployment, the test creates an Ingress with an empty ingress class annotation (but a specified IngressClass field) and we are incorrectly handling it: https://github.com/kubernetes-sigs/ingress-controller-conformance/blob/79bd068cbb31d77c2a060f61eea795f9791a3d48/features/ingress_class.feature#L17
- relevant issue: https://github.com/projectcontour/contour/issues/2146
- default backend:
- Some intermittent
connection refusedfailures, seemingly nothing legitimate, default backends on Ingress seems to work correctly - Some intermittent
503s returned instead of200s
- Some intermittent
- host rules:
Scenario: An Ingress with a host rule should send TLS traffic to the matching backend service- Have not gotten this one to pass though it should, always get a
connection refusederror connecting to port443
- Have not gotten this one to pass though it should, always get a
Scenario: An Ingress with a wildcard host rule should send traffic to the matching backend serviceWhen I send a GET request to http://bar.foo.com, error Then the response status-code must be 200expected status code 200 but 404 was returned
- path rules:
- Intermittent
connection refusedfailures - Some of the
404failures below seem like intermittent failures as well, they passed in a few previous runs Scenario: An Ingress with exact path rules should send traffic to the matching backend service- fails with:
expected status code 200 but 503 was returned
- fails with:
Scenario: An Ingress with exact path rules should not match requests with trailing slash- fails with:
expected status code 404 but 200 was returned
- fails with:
Scenario: An Ingress with prefix path rules should ignore the request trailing slash and send traffic to the matching backend service- fails with:
expected status code 200 but 404 was returned
- fails with:
Scenario: An Ingress with prefix path rules should and send traffic to the matching backend service- fails with
expected status code 200 but 503 was returned
- fails with
Scenario: An Ingress with prefix path rules should match each labels string prefix- fails with 200 instead of 404
- Intermittent
@sunjayBhatia IIRC at least some of these tests are for ingress V1, which isn't implemented in Contour.
I see mention of a design doc here: https://github.com/projectcontour/contour/issues/2139, is this in flight/somewhere I can see/help with? cc @youngnick
I updated #2139 with some further info, basically, the design for IngressClass is very similar to GatewayClass (the object), which is why I was initially suggesting I do them both. Happy to collaborate on something if you are interested @sunjayBhatia. Ping me and we can discuss in a high-bandwidth way somewhere.
Relevant issue https://github.com/kubernetes-sigs/ingress-controller-conformance/issues/87 along with https://github.com/kubernetes-sigs/service-apis/pull/516
I guess you’re working on this already @sunjayBhatia ? can you tag a release that you feel is appropriate?
This seems like the long pole here since no-one is responding: https://github.com/kubernetes-sigs/ingress-controller-conformance/pull/88
@sunjayBhatia, I'd recommend just pinging Bowei on Kubernetes Slack about this one, since it's a one-liner.
Running ingress conformance is pretty consistently flaky, especially in CI since we create a kind cluster and deploy contour/envoy from scratch before running the tests, without any warming of the HTTP/HTTPS listeners
The ingress conformance tests wait for an Ingress to have its load balancer status address set before sending requests for the tests, however in our current architecture we have a goroutine separate from the Ingress processor that sets status on Ingress objects. This means the Ingress has an address possibly before it is programmed in Envoy and the tests move along accordingly, often sending requests that fail b/c Contour hasn't programmed Envoy yet or Envoy's listeners are still starting. The conformance suite also does not ever retry, it sends one request for each check.
Options here:
- Change Contour to only set status once we have processed an Ingress
- We don't have ACK/NACK so we don't really know when the Ingress is fully configured, but we could move the status updating out from the separate informer/goroutine to the ingress processor as a start
- There isn't actually anything in the Ingress API as far as I can tell that says status has to be set only when the Ingress is "ready" or "admitted", which makes the fact we would have to do this only caused by the quirks/implicit requirements the test suite adds
- Change tests to retry or wait longer for Ingress to be programmed in Envoy
- see second sub-point for the above
Seems like this is another victim of the underspecced nature of the Ingress API. If only there was some replacement that didn't have the same problems! 😝
More seriously, I until we have ACK support, we don't have any way to check if the programming is complete, and even if we did, there's no way to tell from the object (since there's no Conditions). So I think we'll have to just change the tests to retry some number of times or wait. Can you see what other implementations do for their tests here?
A quick run through and I don't see anything in CI for other Ingress controllers, but I didn't do the most digging
I think the most important thing is that Contour can pass the conformance tests eventually, once things are stable. So retries or longer timeouts are okay in my book.
https://github.com/kubernetes-sigs/ingress-controller-conformance/issues/92
waiting on upstream ^ will make a PR but the repo has not been very active so even if I do it probably won't make it for a while
should probably pull this out of 1.16
The Contour project currently lacks enough contributors to adequately respond to all Issues.
This bot triages Issues according to the following rules:
- After 60d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, the Issue is closed
You can:
- Mark this Issue as fresh by commenting
- Close this Issue
- Offer to help out with triage
Please send feedback to the #contour channel in the Kubernetes Slack
The Contour project currently lacks enough contributors to adequately respond to all Issues.
This bot triages Issues according to the following rules:
- After 60d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, the Issue is closed
You can:
- Mark this Issue as fresh by commenting
- Close this Issue
- Offer to help out with triage
Please send feedback to the #contour channel in the Kubernetes Slack