
Use retry action, don't retry otherwise

Open michaelpj opened this issue 3 years ago • 6 comments

This uses an action for retrying steps, which is a bit neater and lets us specify more clearly what's going on, as well as control the number of retries independently.

I think it's better to specifically use this on the test suites that we believe to be flaky, rather than adding a lot of noise by doing this on every test invocation.
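As a rough sketch of what this could look like in a workflow, assuming the nick-fields/retry action linked later in this thread: the step name, the `cabal test ghcide` command, and the attempt and timeout values are illustrative assumptions, not taken from the actual patch.

```yaml
# Hypothetical step: retry only a test suite believed to be flaky.
- name: Test ghcide (retried because it is known to be flaky)
  uses: nick-fields/retry@v2
  with:
    max_attempts: 2          # assumed value; controls retries for this step only
    timeout_minutes: 30      # assumed value; the action requires a timeout
    command: cabal test ghcide

# Hypothetical step: a suite believed to be stable runs once, with no retry wrapper.
- name: Test a stable suite
  run: cabal test some-stable-suite   # placeholder suite name
```

Keeping the retry wrapper off the stable suites means that a failure there still fails the job immediately.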

michaelpj, Dec 16 '22 12:12

Let's fix flaky tests properly. I propose we invert this patch to retry all testsuites thrice unconditionally and fail if any run fails. Then we will have a list of most flaky tests.
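One way to read this proposal, sketched as a workflow step: the step name and the `cabal test ghcide` command are assumptions for illustration, and the loop is only one possible way to run a suite three times and fail as soon as any run fails.

```yaml
# Hypothetical step: run the suite three times and fail on the first failing run.
- name: Run test suite three times, fail if any run fails
  shell: bash
  run: |
    set -e                      # abort the step as soon as one run fails
    for attempt in 1 2 3; do
      echo "Run $attempt of 3"
      cabal test ghcide         # placeholder command
    done
```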

I've started a patch to fix flaky testsuite items in ghcide.

wz1000, Dec 21 '22 12:12

> Let's fix flaky tests properly. I propose we invert this patch to retry all testsuites thrice unconditionally and fail if any run fails.

I'm definitely in favour of fixing flaky tests; I just want our CI to be passing for most people instead of requiring constant restarting as it does right now. Then we can make a ticket for each flaky job to work on it. Having things fail a lot is an incentive to fix the tests for sure, but it also just slows everything down tons.

michaelpj, Dec 21 '22 12:12

That said, if you think you can actually fix the flakiness soon then go for it!

michaelpj, Dec 21 '22 14:12

The retry action also seems to add a warning to each run with retries by default (warning_on_retry). So flaky tests can be identified by going through past runs, without failing pull requests that didn't cause them. I haven't seen this action before, but it looks to me like a nice improvement over the current state.

https://github.com/nick-fields/retry#warning_on_retry

andys8, Dec 21 '22 19:12

@wz1000 did you get most of the flaky tests? I'd be happy to just remove the retrying on everything if you think it's not needed now.

michaelpj, Jan 10 '23 12:01

No, #3423 seems like an issue that causes quite a bit of flakiness throughout the testsuite, but it will require a bit of work to resolve.

wz1000, Jan 10 '23 12:01