[Bug]: multiple kernels with a 2D domain on remote variables result in internal error
### Summary of Problem
The following code produces the error "gpu-nvidia.c:292: Error calling CUDA function: an illegal memory access was encountered".
```chapel
const D = {0..<10, 0..<10};
on here.gpus[0] var A: [D] bool;
on here.gpus[0] var B: [D] bool;
on here.gpus[0] {
  const DD = D; // localize domain
  forall idx in DD do B = A[idx];
  var neq: [DD] bool;
  foreach idx in DD do neq[idx] = A[idx] != B[idx];
}
```
There are two kernels in this code: the `forall` and the `foreach`. Commenting out one or the other makes the error go away. Also note that `D` is a 2D domain; if it is 1D, the error does not occur. Lastly, declaring `A` and `B` inside the `on` block (instead of as remote variable declarations) makes the error go away.
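For reference, here is a sketch (untested) of the last variant mentioned above, with `A` and `B` declared inside the `on` block rather than as remote variables:

```chapel
const D = {0..<10, 0..<10};
on here.gpus[0] {
  // Declaring the arrays inside the `on` block (no remote variable
  // declarations) makes the error go away.
  var A: [D] bool;
  var B: [D] bool;
  const DD = D; // localize domain
  forall idx in DD do B = A[idx];
  var neq: [DD] bool;
  foreach idx in DD do neq[idx] = A[idx] != B[idx];
}
```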
### Configuration Information
- Output of `chpl --version`: 2.2.0 pre-release
- Output of `$CHPL_HOME/util/printchplenv --anonymize`:

```
CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: llvm
CHPL_TARGET_ARCH: x86_64
CHPL_TARGET_CPU: native
CHPL_LOCALE_MODEL: gpu *
CHPL_GPU: nvidia *
CHPL_COMM: none
CHPL_TASKS: qthreads
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_ATOMICS: cstdlib
CHPL_GMP: bundled
CHPL_HWLOC: bundled
CHPL_RE2: bundled
CHPL_LLVM: bundled *
CHPL_AUX_FILESYS: none
```

- Back-end compiler and version, e.g. `gcc --version` or `clang --version`: LLVM 18
Some other data points/thoughts:
- Making the first loop a `foreach` also works.
- Between `forall`-as-the-first-loop and `foreach`-as-the-first-loop, there's an LICM difference in the second loop: the former causes `A` and `B` to not be LICM'ed, leaving array metadata inside the kernel, whereas in the latter case all we have is `ddata`s, and that works fine. LICM shouldn't impact correctness, though, so what the real bug is remains curious.
- I am curious whether having `A` and `B` as remote-declared variables vs. local ones in scope is impacting LICM, rather than being the root cause of the issue itself. IOW, the remote-declared-ness of these variables may only matter because it produces a different AST structure.
- On newer CUDAs, I actually see `misaligned address`, which is much harder to debug. I wonder if we should try to debug this on an older CUDA with cuda-gdb to understand what's wrong.
General info on passing arrays as a whole (the array record) to kernels:
- Our implementation there has always been a bit shaky, which gets papered over by aggressive LICM for the most part.
- A GPU array has a CPU-based record that wraps a GPU-based class. We should be able to pass C structs/Chapel records directly to CUDA kernels since they are stack-allocated. However, my understanding could be wrong for the CUDA driver API we use: we pass addresses of parameters to the kernel launch API, and it doesn't make much sense to pass the address of something stack-allocated, like the array record.
- To address that, we are supposed to pass array records by offload: we allocate memory on the device, bit-copy the record, and use that copy. Is something going wrong there?
> On newer CUDAs, I actually see misaligned address, which is much harder to debug. I wonder if we should try to debug this on an older CUDA with cuda-gdb to understand what's wrong.
Just noting that I saw this as well. Sometimes the runs would be "illegal memory access" and sometimes it was "misaligned address"
Are N-dimensional domains still only parallel over the first dimension on GPUs?
Yes. See https://github.com/chapel-lang/chapel/issues/22152 and https://github.com/chapel-lang/chapel/issues/24331
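To illustrate the implication (a hypothetical sketch, not from this issue): since only the first dimension is parallelized across GPU threads, a 2D `foreach` over an `n` x `m` domain gets only `n`-way parallelism. Manually flattening the iteration to a 1D range recovers `n*m`-way parallelism:

```chapel
config const n = 10, m = 10;
on here.gpus[0] {
  var A: [0..<n, 0..<m] int;
  // A 2D `foreach` over A's domain would parallelize only over the first
  // dimension (n threads). Flattening to a 1D range exposes n*m-way
  // parallelism; each thread recovers its 2D index arithmetically.
  foreach i in 0..<n*m {
    const (r, c) = (i / m, i % m);
    A[r, c] = i;
  }
}
```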
This might get lost in a previous comment I made, but based on your recollection (not asking you to rerun anything) @jabraham17 would it be correct to say that using foreach for both loops is the acceptable workaround for the scenario in the OP?
> This might get lost in a previous comment I made, but based on your recollection (not asking you to rerun anything) @jabraham17 would it be correct to say that using foreach for both loops is the acceptable workaround for the scenario in the OP?
Yes, using only foreach for both loops is a good workaround for this issue
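Concretely, the workaround applied to the original reproducer looks like this (a sketch, untested):

```chapel
const D = {0..<10, 0..<10};
on here.gpus[0] var A: [D] bool;
on here.gpus[0] var B: [D] bool;
on here.gpus[0] {
  const DD = D; // localize domain
  // Workaround: use `foreach` for both loops instead of `forall` + `foreach`.
  foreach idx in DD do B = A[idx];
  var neq: [DD] bool;
  foreach idx in DD do neq[idx] = A[idx] != B[idx];
}
```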