[Bug] SIGKILL in CI
🐛 Bug Report
Recent CI runs have been failing with SIGKILLs, which are likely OOM kills. Could this be due to the machines used in CI?
Example Error:
Summary [ 994.564s] 21 tests run: 19 passed (9 slow), 2 failed, 643 skipped
SIGKILL [ 33.113s] snarkvm-synthesizer-program::lib instruction::hash::test_hash_sha3_512_native_is_consistent
SIGKILL [ 125.955s] snarkvm-synthesizer-program::lib instruction::hash::test_hash_sha3_384_native_is_consistent
Link here
I have had issues like this too, but mostly during compilation. What I usually do is reduce the number of concurrent build jobs or test threads. It would be nice to have CI machines with more memory to avoid this.
Cc @meddle0x53: as expected, you can iterate towards the right configuration. Don't aim for a 100% success rate, as we will also prune some of the tests.
In my branch for improving the CI runs, I've triggered the CI countless times with different changes, and so far I haven't seen a single SIGKILL. So this will probably almost never happen once that branch gets merged.
In general, even without my latest changes, the fix is to set --test-threads=X for the failing task. If the task already has this flag, set X to something lower. If it doesn't, start with X=16 or X=8 and lower it if SIGKILL continues to show up.
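
For anyone who wants to try this locally, here is a rough sketch of the "start at 16 or 8 and lower it" loop. It assumes the tests run through cargo-nextest (the log format above suggests so) and uses its --test-threads flag; with plain `cargo test` the flag goes after `--` instead. The helper names `next_threads` and `run_with_fewer_threads` are made up for illustration, and halving on each retry is just one possible way to lower X.

```shell
#!/usr/bin/env bash

# Halve the thread count on each retry, never dropping below 1.
# (Halving is an assumption; any decreasing schedule would do.)
next_threads() {
  local n="$1"
  if [ "$n" -gt 1 ]; then
    echo $(( n / 2 ))
  else
    echo 1
  fi
}

# Hypothetical wrapper: rerun one package's tests with fewer threads
# until they pass or the thread count reaches 1.
# Caveat: this retries on any failure, not only on SIGKILL, because the
# runner's exit code alone does not tell the two apart.
run_with_fewer_threads() {
  local pkg="$1"
  local threads="${2:-16}"
  while true; do
    echo "running $pkg with --test-threads $threads"
    if cargo nextest run -p "$pkg" --test-threads "$threads"; then
      return 0
    fi
    if [ "$threads" -le 1 ]; then
      return 1
    fi
    threads="$(next_threads "$threads")"
  done
}
```

Calling `run_with_fewer_threads snarkvm-synthesizer-program 16` would then try 16, 8, 4, 2 and 1 threads in turn. In CI itself it is probably simpler to hardcode the value that turns out to be stable for each task.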