parsec
Termination detection fault with dtd
Describe the bug
Seen only once, on #321; need to check whether it also happens on master.
To Reproduce
29302 Command: "/apps/spacks/2023-08-14/opt/spack/linux-rocky9-x86_64/gcc-11.3.1/openmpi-4.1.5-2rgaqk2wseegpmbdbbygvwrljccjaqsk/bin/mpiexec" "-n" "4" "dsl/did/dtd_test_task_insertion"
...
29329 dtd_test_task_insertion: /home/bouteill/parsec/dplasma/parsec/parsec/mca/termdet/local/termdet_local_module.c:114: parsec_termdet_local_termination_detected: Assertion `tp->tdm.monitor == PARSEC_TERMDET_LOCAL_TERMINATED' failed.
29330 [leconte:4113702] *** Process received signal ***
29340 [leconte:4113702] [ 7] /home/bouteill/parsec/dplasma/build.cuda/parsec/parsec/libparsec.so.4(+0xb042f)[0x7fa93d0fc42f]
29341 [leconte:4113702] [ 8] /home/bouteill/parsec/dplasma/build.cuda/parsec/parsec/libparsec.so.4(parsec_release_dtd_task_to_mempool+0x32)[0x7fa93d0ce596]
29342 [leconte:4113702] [ 9] /home/bouteill/parsec/dplasma/build.cuda/parsec/parsec/libparsec.so.4(__parsec_complete_execution+0xc6)[0x7fa93d0b7f70]
29343 [leconte:4113702] [10] /home/bouteill/parsec/dplasma/build.cuda/parsec/parsec/libparsec.so.4(__parsec_task_progress+0x12e)[0x7fa93d0b80ca]
29344 [leconte:4113702] [11] /home/bouteill/parsec/dplasma/build.cuda/parsec/parsec/libparsec.so.4(__parsec_context_wait+0x2ee)[0x7fa93d0b8c0a]
29345 [leconte:4113702] [12] /home/bouteill/parsec/dplasma/build.cuda/parsec/parsec/libparsec.so.4(+0x49343)[0x7fa93d095343]