[Bug]: Calling procedure passed by parameter fails in assertOnGpu()
Summary of Problem
Description:
Calling a procedure that was passed as a parameter makes assertOnGpu() fail. In the following code, test1() calls the procedure increment() directly, while test2() calls it through a procedure-typed parameter. The assertOnGpu() check in test2() fails, as shown in the compilation output below.
The output also prints GPUFunctionCall.chpl:17: note: call to a primitive that is not fast and local. How can I make a procedure passed as a parameter "fast and local"?
Steps to Reproduce
Source Code:
module GPUFUnctionCall {
  use GpuDiagnostics;

  proc increment(x: real): real {
    return x + 1;
  }

  proc test1(ref A: [] real, n: int)
  {
    @assertOnGpu foreach i in 0..#n {
      A[i] = increment(A[i]);
    }
  }

  proc test2(ref A: [] real, n: int, f: proc(x: real): real) {
    @assertOnGpu foreach i in 0..#n {
      A[i] = f(A[i]);
    }
  }

  proc main() {
    on here.gpus[0] {
      const n = 128;
      var A: [0..#n] real;
      test1(A, n);
      test2(A, n, increment);
      writeln(A[5]);
    }
  }
}
Compile command:
chpl --fast GPUFUnctionCall.chpl
Compilation output:
GPUFunctionCall.chpl:15: In function 'test2':
GPUFunctionCall.chpl:16: error: Loop is marked with @assertOnGpu but is not eligible for execution on a GPU
GPUFunctionCall.chpl:17: note: call to a primitive that is not fast and local
GPUFunctionCall.chpl:27: called as test2(A: [domain(1,int(64),one)] real(64), n: int(64), f: proc(x: real): real)
note: generic instantiations are underlined in the above callstack
Configuration Information
Chapel version
warning: The prototype GPU support implies --no-checks. This may impact debuggability. To suppress this warning, compile with --no-checks explicitly
chpl version 2.1.0 pre-release (0198d1f1ea)
built with LLVM version 17.0.6
available LLVM targets: amdgcn, r600, nvptx64, nvptx, aarch64_32, aarch64_be, aarch64, arm64_32, arm64, x86-64, x86
Copyright 2020-2024 Hewlett Packard Enterprise Development LP
Copyright 2004-2019 Cray Inc.
(See LICENSE file for more details)
Chapel configuration
CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: llvm
CHPL_TARGET_ARCH: x86_64
CHPL_TARGET_CPU: native
CHPL_LOCALE_MODEL: gpu *
CHPL_GPU: nvidia
CHPL_COMM: none
CHPL_TASKS: qthreads *
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_ATOMICS: cstdlib
CHPL_GMP: none
CHPL_HWLOC: bundled
CHPL_RE2: bundled
CHPL_LLVM: bundled *
CHPL_AUX_FILESYS: none