Recursive Termdet is broken

Open • abouteiller opened this issue 1 year ago • 1 comment

Describe the bug

The local termination detection module (termdet) causes recursive runs to crash with a segmentation fault.

To Reproduce

Steps to reproduce the behavior:

  1. Check out b958ae9f
  2. Check out parsec 9fc74b6f1
  3. Configure and compile with the following options: ../dplasma/configure --disable-fortran --with-platform=macosx --enable-debug=paranoid\,noisier
  4. Run the following test and see the error:
tests/testing_dpotrf -N 10000 -t 200 -z 100 -x -v
W@00000 /!\ DEBUG LEVEL WILL PROBABLY REDUCE THE PERFORMANCE OF THIS RUN /!\.
#+++++ cores detected       : 4
#+++++ nodes x cores + gpu  : 1 x 4 + 0 (4+0)
#+++++ thread mode          : THREAD_SERIALIZED
#+++++ P x Q                : 1 x 1 (1/1)
#+++++ M x N x K|NRHS       : 10000 x 10000 x 1
#+++++ MB x NB              : 200 x 200
#+++++ HMB x HNB            : 100 x 100
[aurelien16:88452] *** Process received signal ***
[aurelien16:88452] Signal: Segmentation fault: 11 (11)
[aurelien16:88452] Signal code: Address not mapped (1)
[aurelien16:88452] Failing at address: 0x300000008
[aurelien16:88452] [ 0] 0   libsystem_platform.dylib            0x00007ff8042f2dfd _sigtramp + 29
[aurelien16:88452] [ 1] 0   ???                                 0x0000600000189280 0x0 + 105553117876864
[aurelien16:88452] [ 2] 0   libparsec.4.0.0.dylib               0x00000001012c737a parsec_atomic_fetch_dec_int32 + 26
[aurelien16:88452] [ 3] 0   libparsec.4.0.0.dylib               0x00000001012c734c parsec_taskpool_termination_detected + 76
[aurelien16:88452] [ 4] 0   libparsec.4.0.0.dylib               0x00000001012ffecf parsec_termdet_local_termination_detected + 591
[aurelien16:88452] [ 5] 0   libparsec.4.0.0.dylib               0x00000001012ff185 parsec_termdet_local_taskpool_addto_nb_tasks + 1013
[aurelien16:88452] [ 6] 0   libparsec.4.0.0.dylib               0x00000001012d8174 parsec_release_task_to_mempool_update_nbtasks + 68
[aurelien16:88452] [ 7] 0   libdplasma.2.0.dylib                0x0000000103f74bae release_task_of_dtrsm_LUT_dtrsm + 126
[aurelien16:88452] [ 8] 0   libparsec.4.0.0.dylib               0x00000001012c7fbc __parsec_complete_execution + 204
[aurelien16:88452] [ 9] 0   libparsec.4.0.0.dylib               0x00000001012c8143 __parsec_task_progress + 323
[aurelien16:88452] [10] 0   libparsec.4.0.0.dylib               0x00000001012c86d4 __parsec_context_wait + 980
[aurelien16:88452] [11] 0   libparsec.4.0.0.dylib               0x00000001012a5c7e __parsec_thread_init + 1230
[aurelien16:88452] [12] 0   libsystem_pthread.dylib             0x00007ff8042dd4e1 _pthread_start + 125
[aurelien16:88452] [13] 0   libsystem_pthread.dylib             0x00007ff8042d8f6b thread_start + 15
[aurelien16:88452] *** End of error message ***
[1]    88452 segmentation fault  tests/testing_dpotrf -N 10000 -t 200 -z 100 -x -v
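
For convenience, the reproduction steps above could be collected into a small script along these lines. This is only a sketch under several assumptions that the report does not state: that b958ae9f is a DPLASMA commit, that PaRSEC lives in a `parsec` subdirectory of the DPLASMA tree, that the build happens out of tree, and that `make` is the build command. The variables `DPLASMA_SRC`, `PARSEC_SRC`, `BUILD_DIR` and `DRY_RUN` and the helpers `run` and `repro` are hypothetical names introduced here. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the reproduction steps; paths and the build procedure are assumptions.
set -eu

# Assumed layout (not stated in the report): a DPLASMA checkout with PaRSEC
# in a subdirectory, and a separate build directory.
DPLASMA_SRC=${DPLASMA_SRC:-$PWD/dplasma}
PARSEC_SRC=${PARSEC_SRC:-$DPLASMA_SRC/parsec}
BUILD_DIR=${BUILD_DIR:-$PWD/build}

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

repro() {
    # Steps 1 and 2: pin both source trees to the reported commits.
    run git -C "$DPLASMA_SRC" checkout b958ae9f
    run git -C "$PARSEC_SRC" checkout 9fc74b6f1

    # Step 3: configure with the reported options, then build (make is assumed).
    run mkdir -p "$BUILD_DIR"
    run cd "$BUILD_DIR"
    run "$DPLASMA_SRC/configure" --disable-fortran --with-platform=macosx \
        --enable-debug=paranoid\,noisier
    run make

    # Step 4: run the failing test with the reported arguments.
    run tests/testing_dpotrf -N 10000 -t 200 -z 100 -x -v
}

repro
```

Running it as is lists the commands for review; running it with `DRY_RUN=0` executes them, provided the assumed directory layout matches the local setup.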

abouteiller, May 11 '23 21:05