sel4test
pc99: SCHED0011 sometimes times out
In this run: https://github.com/seL4/util_libs/actions/runs/4920068732/jobs/8789972531?pr=156#step:4:3957 we're getting:
Tue, 09 May 2023 02:21:49 GMT Test SCHED0010 passed
Tue, 09 May 2023 02:21:49 GMT </testcase>
Tue, 09 May 2023 02:21:49 GMT <testcase classname="sel4test" name="SCHED0011">
Tue, 09 May 2023 02:21:49 GMT Running test SCHED0011 (Test scheduler accuracy)
Tue, 09 May 2023 02:36:11 GMT
Tue, 09 May 2023 02:36:11 GMT [[Timeout]]
Tue, 09 May 2023 02:36:11 GMT None
Tue, 09 May 2023 02:36:12 GMT
Tue, 09 May 2023 02:36:12 GMT console_run returned -1
Tue, 09 May 2023 02:36:12 GMT Shutting down haswell4
Note that 15 minutes pass between the start of SCHED0011 and the timeout.
The failing config was PC99_debug_MCS_clang_32
Seems to happen randomly. Today https://github.com/seL4/seL4/actions/runs/12040941622/job/33573351124#step:4:23728 on the gcc 32-bit build
It's x86; you can get an NMI at any time, which will take an unknown amount of time to be handled by the system firmware. Still, a 2 ms deviation seems like a lot.
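For context, SCHED0011 is the scheduler-accuracy test. A rough, hypothetical sketch of what such a check looks like (not the actual sel4test code, and using host Python timing rather than the seL4 timer API) is below; a single firmware-serviced NMI/SMI during one iteration would show up as a large outlier against a tight tolerance.

```python
# Hypothetical illustration only, not the sel4test SCHED0011 implementation:
# repeatedly sleep for a fixed period and record how far the measured
# elapsed time drifts from the requested period.
import time

PERIOD_S = 0.001      # requested sleep per iteration (1 ms)
ITERATIONS = 1000
TOLERANCE_S = 0.002   # flag the run if any iteration deviates by > 2 ms

worst = 0.0
for _ in range(ITERATIONS):
    start = time.monotonic()
    time.sleep(PERIOD_S)
    elapsed = time.monotonic() - start
    worst = max(worst, abs(elapsed - PERIOD_S))

print(f"worst deviation: {worst * 1e6:.0f} us")
if worst > TOLERANCE_S:
    print("FAIL: deviation exceeds tolerance")
```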
SCHED0007 and SCHED0008 don't seem to clean up all spawned threads; I'm not sure what effect that has on subsequent tests.
This might have been fixed by my recent sel4bench bugfixes, but either way I haven't seen this for a while. Should we close this?
I have also not seen this in a very long time. Would it be possible to run a stress test to confirm? (E.g. that one test a few hundred times -- that should exceed the frequency of previous occurrences by a lot).
I'll give it a go. My stress tests usually run for at least an hour, or for 24 hours, depending on what they test. Even just to filter something out, you usually need to run it at least a thousand times.
That said, if it was a corner-case bug or a race in the timer driver, then repeating the same test probably won't trigger it again.
(Remind me if I forget to do this.)
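Something like the following would do as a stress loop: rerun just SCHED0011 a few hundred times and tally timeouts. `run_test.sh` is a placeholder for whatever actually boots the image and watches the serial output; it is not a real script in this repository, and the run count and timeout are only rough guesses based on the numbers above.

```python
# Hypothetical stress loop: rerun one test many times and count failures.
import subprocess

RUNS = 500           # a few hundred runs should exceed the observed failure rate
TIMEOUT_S = 15 * 60  # the CI job gave up after roughly 15 minutes

failures = 0
for i in range(RUNS):
    try:
        result = subprocess.run(
            ["./run_test.sh", "SCHED0011"],  # placeholder test runner
            timeout=TIMEOUT_S,
            capture_output=True,
        )
        if result.returncode != 0:
            failures += 1
    except subprocess.TimeoutExpired:
        failures += 1
    print(f"run {i + 1}/{RUNS}: {failures} failure(s) so far")
```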
Agreed, it's not going to be certain, but it should be better than "we haven't seen this in a while" :-)