3 flaky tests
I ran cargo test on Quinn (05f6e67de633245526d5d2773e27eb75c70b2bdd) 1,003,080 times. This is what I found.
There are 3 flaky tests:
- tests::single_ack_eliciting_packet_triggers_ack_after_delay fails 0.101% of the time. I collected 1021 occurrences. The flakiness of this test was also noted in #2014.
In one arbitrarily chosen failure, the error was this:
thread 'tests::single_ack_eliciting_packet_triggers_ack_after_delay' panicked at quinn-proto/src/tests/mod.rs:2490:5:
assertion `left == right` failed
  left: Instant { tv_sec: 137912, tv_nsec: 296566897 }
 right: Instant { tv_sec: 137912, tv_nsec: 218566897 }
- tests::key_update_reordered fails 0.098% of the time. I collected 983 occurrences. The flakiness of this test was also noted in #1695.
In one arbitrarily chosen failure, the error was this:
thread 'tests::key_update_reordered' panicked at quinn-proto/src/tests/mod.rs:1064:5:
assertion `left == right` failed
  left: 1
 right: 0
- tests::key_update_simple fails 0.015% of the time. I collected 146 occurrences. This is the rarest of the bunch, and I wasn't able to find evidence that this has been noticed before.
In one arbitrarily chosen failure, the error was this:
thread 'tests::key_update_simple' panicked at quinn-proto/src/tests/mod.rs:1021:5:
assertion failed: `None` does not match `Some(Event::Stream(StreamEvent::Readable { id })) if id == s`
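To put these rates in perspective, here is a rough sketch that recomputes the per-run failure rate from the occurrence counts above and estimates how many runs it takes to have a 95% chance of seeing each failure at least once. It assumes that runs are independent and that every one of the 1,003,080 runs executed all three tests; the function names are made up for illustration and are not part of Quinn or of the tooling used here.

```rust
/// Fraction of runs in which a test failed.
fn failure_rate(failures: u64, total_runs: u64) -> f64 {
    failures as f64 / total_runs as f64
}

/// Smallest n such that the chance of at least one failure in n
/// independent runs reaches `confidence`:
///   1 - (1 - p)^n >= c   =>   n >= ln(1 - c) / ln(1 - p)
fn runs_to_observe(rate: f64, confidence: f64) -> u64 {
    ((1.0 - confidence).ln() / (1.0 - rate).ln()).ceil() as u64
}

fn main() {
    // Total number of `cargo test` runs reported in this issue.
    const TOTAL_RUNS: u64 = 1_003_080;

    // (test name, number of failing runs collected)
    let tests = [
        ("single_ack_eliciting_packet_triggers_ack_after_delay", 1021u64),
        ("key_update_reordered", 983),
        ("key_update_simple", 146),
    ];

    for (name, failures) in tests {
        let rate = failure_rate(failures, TOTAL_RUNS);
        println!(
            "{name}: {:.4}% per run, about {} runs for a 95% chance of one failure",
            rate * 100.0,
            runs_to_observe(rate, 0.95)
        );
    }
}
```

The practical takeaway is that the rarest failure needs on the order of tens of thousands of runs to show up reliably, which is why a handful of local or CI runs would almost never catch it.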
I am attaching to this issue grouped.zip, which contains the stdout/stderr of all runs in which the tests failed, grouped by which test failed. These contain terminal color codes, so I recommend you read the files with cat.
https://github.com/gretchenfrage/quinn-scrutinizer
https://github.com/quinn-rs/quinn/pull/2292 fixes the key_update_reordered case. I wouldn't be surprised if key_update_simple was a variation on the same root cause.
I don't immediately see how single_ack_eliciting_packet_triggers_ack_after_delay could be related, but it wouldn't be a shock, considering that initial key phase size is one of the few bits of nondeterminism we have in the test and the repro rate seems similar. If it is related, then it won't have been fixed by the above PR, but would be made consistent by using a constant RNG seed for the endpoint.
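To illustrate the constant-seed idea in isolation, here is a minimal sketch. It does not use Quinn's API: `SplitMix64` is a small self-contained generator standing in for whatever RNG the endpoint uses, and `draw_initial_values` is a hypothetical stand-in for any randomized initial parameter. The only point is that a fixed seed yields the same sequence on every run, so a failure that depends on the drawn values either always reproduces or never does.

```rust
/// Small deterministic PRNG (SplitMix64), used here only as a stand-in
/// for the endpoint's real RNG.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical helper: draws `count` randomized parameters from a
/// generator initialized with `seed`.
fn draw_initial_values(seed: u64, count: usize) -> Vec<u64> {
    let mut rng = SplitMix64::new(seed);
    (0..count).map(|_| rng.next_u64() % 1000).collect()
}

fn main() {
    // Same seed, same sequence: the test sees identical inputs every run.
    let first = draw_initial_values(42, 5);
    let second = draw_initial_values(42, 5);
    assert_eq!(first, second);

    // A different seed gives a different sequence, which is the
    // run-to-run variation an unseeded test is exposed to.
    let other = draw_initial_values(43, 5);
    println!("seed 42: {first:?}");
    println!("seed 43: {other:?}");
}
```

Pinning the seed trades coverage for reproducibility: a seed that happens to avoid the failing values would hide the bug, so it may be worth recording a seed from a known-failing run.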