Support for multidimensional arrays in loop expressions as kernels
Using a multidimensional array in a loop expression prevents the compiler from being able to create kernels. This should work and result in a kernel launch.
For example, the following loop expression cannot be gpuized:
on here.gpus[0] {
  const D = {1..10, 1..10};
  var x: [D] int;
  var y: [D] int;
  var z = foreach i in x.domain do x(i) != y(i);
}
Compiling with @assertOnGpu fails, because the compiler determines that the loop expression is not gpuizable. I also confirmed with start/stopVerboseGpu that no kernel is launched. Changing D to the one-dimensional {1..10} does work and does result in a kernel launch. Rewriting the loop expression as a loop statement also results in a kernel launch, regardless of the domain:
var z: [D] bool;
foreach i in x.domain do z(i) = x(i) != y(i);
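For anyone reproducing this, here is a minimal sketch of how the loop-statement form can be checked for kernel launches. It assumes the GpuDiagnostics module and a GPU-enabled build; the wrapping of the whole on block in the diagnostics calls is my choice, not something the compiler requires. The array declarations may launch kernels of their own, so the count printed at the end is not expected to be exactly one; the verbose output is what shows which source line each launch came from.

```chapel
use GpuDiagnostics;

// Print a message for every kernel launch and count launches per locale.
startVerboseGpu();
startGpuDiagnostics();

on here.gpus[0] {
  const D = {1..10, 1..10};
  var x: [D] int;
  var y: [D] int;
  var z: [D] bool;

  // Loop statement form: reported above to become a kernel for any domain.
  // @assertOnGpu turns a non-eligible loop into a compile-time error.
  @assertOnGpu
  foreach i in x.domain do z(i) = x(i) != y(i);
}

stopGpuDiagnostics();
stopVerboseGpu();

// One diagnostics record per locale; kernel_launch includes any launches
// caused by array initialization, not only the foreach loop above.
writeln(getGpuDiagnostics());
```

Swapping the loop statement for the loop expression from the first example should reproduce the compile-time failure described above.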
The same is true of promotion, which currently fails to be gpuized with multidimensional arrays. The following fails to compile into a kernel, but it succeeds when the domain is changed to {1..10}:
on here.gpus[0] {
  const D = {1..10, 1..10};
  var x: [D] int;
  var y: [D] int;
  var z = x != y;
}
In the loop expression case, the compiler complains that advance is not GPU eligible and gives no further information. In the promotion case, the compiler gives a little more information:
$CHPL_HOME/modules/internal/ChapelArray.chpl:3550: In function 'chpl__initCopy_shapeHelp':
$CHPL_HOME/modules/internal/ChapelArray.chpl:3598: error: Loop is marked with @assertOnGpu but is not eligible for execution on a GPU
$CHPL_HOME/modules/internal/ChapelArray.chpl:1341: note: called function has outer var access
$CHPL_HOME/modules/internal/ChapelArray.chpl:3598: note: reached via call to 'advance' in loop body here
$CHPL_HOME/modules/internal/ChapelArray.chpl:3533: called as chpl__initCopy_shapeHelp(shape: domain(unmanaged DefaultRectangularDom(2,int(64),one)), ir: _ir_chpl_promo1_!=) from function 'chpl__initCopy'
foo2.chpl:9: called as chpl__initCopy(ir: _ir_chpl_promo1_!=, definedConst: bool)
note: generic instantiations are underlined in the above callstack