Within a forall loop containing a taken continue, a forall loop over a domain literal dereferences nil, hits a LocaleModel error, or just segfaults

Open cassella opened this issue 2 years ago • 18 comments

Summary of Problem

The code below nondeterministically attempts to dereference nil, or hits a LocaleModel error, or just segfaults.

Some of the other variants, shown commented out, hit only the nil-dereference error, or even work, as noted in their trailing comments.

Modifying the code to pick one of the variants based on a config var makes the problem go away.

Steps to Reproduce

Source Code:

var D = {-1..1,-1..1};
var A: [D] bool;

A[(0,0)] = true;

//  writeln(+ reduce ([x in {-1..1}] x)); // works here

forall Exy in D {
  if !A[Exy] then continue;
  var neighbours = 0;

  //writeln(+ reduce ([x in {-1..1}] x)); // nil

  //  writeln(+ reduce ([xy in D] xy)); // works

  //writeln(+ reduce ([xy in {-1..1}] xy)); // nil

  writeln(+ reduce ([xy in {-1..1,-1..1}] xy)); // nil, or child from flat

  //  neighbours = + reduce ([xy in {-1..1,-1..1}] A[Exy+xy]); // nil

  // forall xy in {-1..1,-1..1} do writeln(A[Exy+xy]); // works

  // neighbours = + reduce ([xy in D] A[Exy+xy]);      // works

  //  writeln(+ reduce ({-1..1,-1..1})); // works

  writeln(neighbours);
}
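The summary notes that picking a variant through a config var makes the problem go away, but the report does not show that modification. A minimal sketch of what it might look like follows. The name `variant` and the numbering of the cases are assumptions made for illustration and are not taken from the report.

```chapel
// Hypothetical sketch: `variant` and its case numbers are assumed names,
// not the reporter's actual modification.
config var variant = 1;

var D = {-1..1,-1..1};
var A: [D] bool;

A[(0,0)] = true;

forall Exy in D {
  if !A[Exy] then continue;
  var neighbours = 0;

  // Choose one of the variants from the original reproducer at run time.
  select variant {
    when 1 {
      writeln(+ reduce ([xy in {-1..1,-1..1}] xy));
    }
    when 2 {
      neighbours = + reduce ([xy in {-1..1,-1..1}] A[Exy+xy]);
    }
    otherwise {
      neighbours = + reduce ([xy in D] A[Exy+xy]);
    }
  }

  writeln(neighbours);
}
```

A binary built from this sketch would be run as, for example, `./foo --variant=2`. Per the summary, a version restructured along these lines reportedly no longer fails, so it is useful for narrowing the trigger down rather than for reproducing the crash.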

Execution command:

[edit: these are repeated invocations of the same binary.]

fortytwo@enodia:~/src/chapel (main)$ ./foo 
$CHPL_HOME/modules/internal/ChapelDomain.chpl:1069: error: attempt to dereference nil
Segmentation fault (core dumped)

fortytwo@enodia:~/src/chapel (main)$ ./foo 
Segmentation fault (core dumped)

fortytwo@enodia:~/src/chapel (main)$ ./foo 
$CHPL_HOME/modules/internal/localeModels/flat/LocaleModel.chpl:122: error: halt reached - requesting a child from a flat LocaleModel locale
Segmentation fault (core dumped)

Backtraces from runs under gdb:

$CHPL_HOME/modules/internal/ChapelDomain.chpl:1069: error: attempt to dereference nil
[Switching to Thread 0x7fffeffff640 (LWP 526700)]

Thread 4 "foo" hit Breakpoint 1, gdbShouldBreakHere () at gdb.c:28
28      void gdbShouldBreakHere(void) {printf("%s", "");}
(gdb) i s
#0  gdbShouldBreakHere () at gdb.c:28
#1  0x00005555555ee74b in chpl_exit_common (status=1, all=0) at chplexit.c:38
#2  0x00005555555ee7a5 in chpl_exit_any (status=1) at chplexit.c:60
#3  0x00005555555ec18c in chpl_error_explicit (message=0x5555556924a4 "attempt to dereference nil", lineno=1069, filename=0x5555556900a8 "$CHPL_HOME/modules/internal/ChapelDomain.chpl")
    at error.c:366
#4  0x00005555555ec280 in chpl_error (message=0x5555556924a4 "attempt to dereference nil", lineno=1069, filenameIdx=58) at error.c:440
#5  0x00005555555e4c3f in chpl_check_nil ()
#6  0x000055555558585d in _do_destroy_chpl ()
#7  0x0000555555585aee in deinit_chpl18 ()
#8  0x000055555558063e in chpl__autoDestroy3 ()
#9  0x00005555555e3486 in coforall_fn_chpl11 ()
#10 0x00005555555e3618 in wrapcoforall_fn_chpl11 ()
#11 0x00005555555f436d in chapel_wrapper (arg=0x7ffff5763710) at tasks-qthreads.c:800
#12 0x000055555565b1ae in qthread_wrapper (ptr=0x7ffff57636d0) at /home/fortytwo/src/chapel/third-party/qthread/qthread-src/src/qthread.c:2184
$CHPL_HOME/modules/internal/localeModels/flat/LocaleModel.chpl:122: error: halt reached - requesting a child from a flat LocaleModel locale
[Switching to Thread 0x7ffff61ff640 (LWP 526718)]

Thread 2 "foo" hit Breakpoint 1, gdbShouldBreakHere () at gdb.c:28
28      void gdbShouldBreakHere(void) {printf("%s", "");}
(gdb) i s
#0  gdbShouldBreakHere () at gdb.c:28
#1  0x00005555555ee74b in chpl_exit_common (status=1, all=0) at chplexit.c:38
#2  0x00005555555ee7a5 in chpl_exit_any (status=1) at chplexit.c:60
#3  0x00005555555ec18c in chpl_error_explicit (message=0x7ffff5754190 "halt reached - requesting a child from a flat LocaleModel locale", lineno=122, 
    filename=0x55555568fa30 "$CHPL_HOME/modules/internal/localeModels/flat/LocaleModel.chpl") at error.c:366
#4  0x00005555555ec280 in chpl_error (message=0x7ffff5754190 "halt reached - requesting a child from a flat LocaleModel locale", lineno=122, filenameIdx=25) at error.c:440
#5  0x00005555555c7080 in halt_chpl14 ()
#6  0x00005555555c647d in halt_chpl ()
#7  0x00005555555a991b in _getChild_chpl3 ()
#8  0x000055555558268d in remove_chpl2 ()
#9  0x000055555558586c in _do_destroy_chpl ()
#10 0x0000555555585aee in deinit_chpl18 ()
#11 0x000055555558063e in chpl__autoDestroy3 ()
#12 0x00005555555e3486 in coforall_fn_chpl11 ()
#13 0x00005555555e3618 in wrapcoforall_fn_chpl11 ()
#14 0x00005555555f436d in chapel_wrapper (arg=0x7ffff5761040) at tasks-qthreads.c:800
#15 0x000055555565b1ae in qthread_wrapper (ptr=0x7ffff5761000) at /home/fortytwo/src/chapel/third-party/qthread/qthread-src/src/qthread.c:2184
#16 0x0000000000000000 in ?? ()
$CHPL_HOME/modules/internal/ChapelDomain.chpl:1069: error: attempt to dereference nil

Thread 3 "foo" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff4dff640 (LWP 526823)]
0x0000555555582670 in remove_chpl2 ()
(gdb) i s
#0  0x0000555555582670 in remove_chpl2 ()
#1  0x000055555558586c in _do_destroy_chpl ()
#2  0x0000555555585aee in deinit_chpl18 ()
#3  0x000055555558063e in chpl__autoDestroy3 ()
#4  0x00005555555e3486 in coforall_fn_chpl11 ()
#5  0x00005555555e3618 in wrapcoforall_fn_chpl11 ()
#6  0x00005555555f436d in chapel_wrapper (arg=0x7ffff5761490) at tasks-qthreads.c:800
#7  0x000055555565b1ae in qthread_wrapper (ptr=0x7ffff5761450) at /home/fortytwo/src/chapel/third-party/qthread/qthread-src/src/qthread.c:2184

Associated Future Test(s):
test/parallel/taskPar/nested/forall-with-continue-forall-bracket-expr.chpl #21411
test/parallel/taskPar/nested/forall-with-continue-forall-expr.chpl #21411
test/parallel/taskPar/nested/forall-with-continue-reduction-over-forall-bracket-expr.chpl #21411

Configuration Information

chpl version 1.30.0 pre-release (b'179b81edd3')
  built with LLVM version 14.0.0

CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: llvm
CHPL_TARGET_ARCH: x86_64
CHPL_TARGET_CPU: native *
CHPL_LOCALE_MODEL: flat
CHPL_COMM: none *
CHPL_TASKS: qthreads
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_ATOMICS: cstdlib
CHPL_GMP: bundled
CHPL_HWLOC: bundled
CHPL_RE2: none
CHPL_LLVM: system
CHPL_AUX_FILESYS: none

gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0

Ubuntu clang version 14.0.0-1ubuntu1

The output above is with CHPL_COMM=none for clarity. The problem also occurs with CHPL_COMM=gasnet.

cassella, Dec 23 '22 22:12