//test/extensions/filters/network/rbac:integration_test flakily times out

Open RyanTheOptimist opened this issue 1 year ago • 10 comments

https://dev.azure.com/cncf/4684fb3d-0389-4e0b-8251-221942316e06/_build/results?buildId=177453&tracking_data=ew0KICAic291cmNlIjogIlNsYWNrUGlwZWxpbmVzQXBwIiwNCiAgInNvdXJjZV9ldmVudF9uYW1lIjogIm1zLnZzcy1waXBlbGluZXMucnVuLXN0YXRlLWNoYW5nZWQtZXZlbnQiDQp9
https://dev.azure.com/cncf/envoy/_build/results?buildId=177474&view=logs&j=4930ecaf-18f4-5b3c-dea3-309729c3b3ae&t=573d8780-d7b9-52e3-b4e0-a89886b0b9ff&l=2823

Yan, can you take a look? Looks like it might be related to https://github.com/envoyproxy/envoy/pull/35531

RyanTheOptimist avatar Aug 09 '24 21:08 RyanTheOptimist

@antoniovleonti it looks like this might be your PR.

RyanTheOptimist avatar Aug 09 '24 21:08 RyanTheOptimist

I find this strange because I only changed the HTTP filter, while it's the network filter test that's timing out.

antoniovleonti avatar Aug 13 '24 13:08 antoniovleonti

Yeah, this should be caused by https://github.com/envoyproxy/envoy/pull/33875.

antoniovleonti avatar Aug 13 '24 13:08 antoniovleonti

@yangminzhu

antoniovleonti avatar Aug 13 '24 13:08 antoniovleonti

The timeout flake still exists at head. I don't know if we want to forward fix or revert but I'm going to make a PR to un-revert https://github.com/envoyproxy/envoy/pull/35531.

antoniovleonti avatar Aug 13 '24 14:08 antoniovleonti

I reproduced it with `bazel test //test/extensions/filters/network/rbac:integration_test --runs_per_test=10000`, but it looks like a smaller number of runs would work too.

antoniovleonti avatar Aug 13 '24 14:08 antoniovleonti
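
For anyone who wants to repeat the reproduction above with fewer runs, here is a minimal sketch. It wraps the same `bazel test` invocation in a small shell function. The function name `repro_flake`, the `DRY_RUN` switch, and the default of 1000 runs are assumptions made for illustration and are not part of Envoy's tooling; the thread only says that fewer than 10000 runs would probably suffice. `--test_output=errors` is a standard Bazel option that prints logs only for failing runs.

```shell
# Hypothetical helper, not part of Envoy's tooling.
# Usage: repro_flake [RUNS]   (RUNS defaults to 1000, an assumed value)
repro_flake() {
  runs="${1:-1000}"
  target="//test/extensions/filters/network/rbac:integration_test"

  # Reuse the positional parameters to hold the full command line.
  set -- bazel test "$target" "--runs_per_test=${runs}" --test_output=errors

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "$*"
  else
    "$@"
  fi
}
```

Running `DRY_RUN=1 repro_flake 200` prints the command without executing it; `repro_flake 200` runs the test 200 times and shows output only for runs that fail or time out.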

I'm wondering if this can be fixed by just reducing the durations used in that test. I'm testing this now.

antoniovleonti avatar Aug 13 '24 14:08 antoniovleonti

No luck. PTAL @yangminzhu

antoniovleonti avatar Aug 13 '24 14:08 antoniovleonti

The RBAC test seems to be highly flaky, and we've been seeing a few failures every day. If #33875 is the cause, should it be reverted?

adisuissa avatar Aug 20 '24 20:08 adisuissa

> If https://github.com/envoyproxy/envoy/pull/33875 is the cause, should it be reverted?

@adisuissa @yangminzhu

I was just tracking this again, as it's an ongoing and fairly frequent flake, and I also came to the conclusion that this PR is the culprit (apologies @RyanTheOptimist, I think I suggested the other one previously).

I'm going to raise a revert PR and we can figure it out from there.

phlax avatar Aug 29 '24 10:08 phlax