Recursive Termdet is broken
Describe the bug
Termination detection (termdet) causes recursive task execution to crash with a segmentation fault.
To Reproduce
Steps to reproduce the behavior:
- Check out b958ae9f
- Check out parsec 9fc74b6f1
- Compile with the following options
../dplasma/configure --disable-fortran --with-platform=macosx --enable-debug=paranoid\,noisier
- See error
tests/testing_dpotrf -N 10000 -t 200 -z 100 -x -v
W@00000 /!\ DEBUG LEVEL WILL PROBABLY REDUCE THE PERFORMANCE OF THIS RUN /!\.
#+++++ cores detected : 4
#+++++ nodes x cores + gpu : 1 x 4 + 0 (4+0)
#+++++ thread mode : THREAD_SERIALIZED
#+++++ P x Q : 1 x 1 (1/1)
#+++++ M x N x K|NRHS : 10000 x 10000 x 1
#+++++ MB x NB : 200 x 200
#+++++ HMB x HNB : 100 x 100
[aurelien16:88452] *** Process received signal ***
[aurelien16:88452] Signal: Segmentation fault: 11 (11)
[aurelien16:88452] Signal code: Address not mapped (1)
[aurelien16:88452] Failing at address: 0x300000008
[aurelien16:88452] [ 0] 0 libsystem_platform.dylib 0x00007ff8042f2dfd _sigtramp + 29
[aurelien16:88452] [ 1] 0 ??? 0x0000600000189280 0x0 + 105553117876864
[aurelien16:88452] [ 2] 0 libparsec.4.0.0.dylib 0x00000001012c737a parsec_atomic_fetch_dec_int32 + 26
[aurelien16:88452] [ 3] 0 libparsec.4.0.0.dylib 0x00000001012c734c parsec_taskpool_termination_detected + 76
[aurelien16:88452] [ 4] 0 libparsec.4.0.0.dylib 0x00000001012ffecf parsec_termdet_local_termination_detected + 591
[aurelien16:88452] [ 5] 0 libparsec.4.0.0.dylib 0x00000001012ff185 parsec_termdet_local_taskpool_addto_nb_tasks + 1013
[aurelien16:88452] [ 6] 0 libparsec.4.0.0.dylib 0x00000001012d8174 parsec_release_task_to_mempool_update_nbtasks + 68
[aurelien16:88452] [ 7] 0 libdplasma.2.0.dylib 0x0000000103f74bae release_task_of_dtrsm_LUT_dtrsm + 126
[aurelien16:88452] [ 8] 0 libparsec.4.0.0.dylib 0x00000001012c7fbc __parsec_complete_execution + 204
[aurelien16:88452] [ 9] 0 libparsec.4.0.0.dylib 0x00000001012c8143 __parsec_task_progress + 323
[aurelien16:88452] [10] 0 libparsec.4.0.0.dylib 0x00000001012c86d4 __parsec_context_wait + 980
[aurelien16:88452] [11] 0 libparsec.4.0.0.dylib 0x00000001012a5c7e __parsec_thread_init + 1230
[aurelien16:88452] [12] 0 libsystem_pthread.dylib 0x00007ff8042dd4e1 _pthread_start + 125
[aurelien16:88452] [13] 0 libsystem_pthread.dylib 0x00007ff8042d8f6b thread_start + 15
[aurelien16:88452] *** End of error message ***
[1] 88452 segmentation fault tests/testing_dpotrf -N 10000 -t 200 -z 100 -x -v