
Temperature unit test hang with multiple threads and ranks

Open streeve opened this issue 2 years ago • 2 comments

ExaCA_Temperature_test_OPENMP_np_2_nt_2 failed. This has only been seen once, but it probably indicates a rarely triggered race condition.

streeve avatar Jan 08 '24 15:01 streeve

Still unsure of the exact bug, but I've narrowed it down to testReadTemperatureData: some MPI ranks do not seem to store any of the appropriate temperature data (they might not be reading the file at all).

MattRolchigo avatar Aug 29 '24 16:08 MattRolchigo

Observed a similar hanging unit test today, along with an outright failure in checkTemperatureResults. This seemed to be specific to the Finch-ExaCA coupling, but it could also be related.

MattRolchigo avatar Sep 27 '24 19:09 MattRolchigo