riscv-hobby-os

Integration test is flaky

Open • rtfb opened this issue 3 years ago • 1 comment

It seems to be very rare, but I just stumbled into a case where the integration test broke because the parked hart printed its status faster than the init hart did:

--- testdata/want-output-u64.txt	2022-01-09 14:54:24.705689947 +0000
+++ out/test-output-u64.txt	2022-01-09 14:55:07.326003981 +0000
@@ -1,8 +1,8 @@
+cpu parked: 1
 kinit: cpu 0
 Reading FDT...
 FDT ok
 bootargs: dry-run
 kprintf test several params: foo, 0xF10A, 0
-cpu parked: 1
 KKKK
 qemu-launcher: killing qemu due to timeout
make: *** [Makefile:94: out/test-output-u64.txt] Error 1
Error: Process completed with exit code 2.

This failure makes sense: the harts execute independently, so nothing orders their output. We need some synchronisation primitive to guarantee the output order.
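One possible shape for such a primitive is a one-shot flag that the init hart sets after its last boot-time print, and that the other harts wait on before printing anything. The sketch below is only an illustration under assumptions: the names `init_output_done`, `signal_init_output_done`, `init_output_is_done` and `wait_for_init_output` are hypothetical and not taken from this repository, and it assumes the toolchain provides C11 `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag: becomes true once the init hart has finished
 * printing its boot-time output. */
static atomic_bool init_output_done = false;

/* Assumed to be called by the init hart right after its last boot print.
 * The release store makes its earlier writes visible to harts that
 * later observe the flag with an acquire load. */
void signal_init_output_done(void) {
    atomic_store_explicit(&init_output_done, true, memory_order_release);
}

/* Non-blocking check of the flag. */
bool init_output_is_done(void) {
    return atomic_load_explicit(&init_output_done, memory_order_acquire);
}

/* Assumed to be called by every other hart before it prints its status.
 * Busy-waits until the init hart has signalled; a real kernel might
 * prefer to sleep in the loop instead of spinning. */
void wait_for_init_output(void) {
    while (!init_output_is_done()) {
        /* spin */
    }
}
```

With this in place, a parked hart would call `wait_for_init_output()` before printing "cpu parked", and the init hart would call `signal_init_output_done()` after its final test print. This orders the two groups of output relative to each other, but not the output of several parked harts among themselves.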

rtfb, Jan 09 '22 15:01

The issue with parked harts was fixed in fe3db6a9a55e6b2db863247f60a2b7b49d07d2bf by simply not printing that string. However, the tests are now flaky in another way:

--- testdata/want-smoke-test-output-u64.txt	2022-10-20 17:29:02.503101310 +0000
+++ out/smoke-test-output-u64.txt	2022-10-20 17:31:46.356986692 +0000
@@ -19,5 +19,6 @@
 1    S      sh
 4    S      hang
 6    R      ps
-
+q
 qemu-launcher: killing qemu due to timeout
+emu-system-riscv64: terminating on signal 15 from pid 4416 (python3)
make: *** [Makefile:227: out/smoke-test-output-u64.txt] Error 1

This seems to happen predominantly with the 64-bit version, for some reason.

rtfb, Oct 20 '22 18:10