
`assertOnGpu` causes compiler segfault with `CHPL_LOCALE_MODEL!=gpu`

Open e-kayrakli opened this issue 1 year ago • 3 comments

While working on a GPU/non-GPU portability branch, I realized that assertOnGpu causes segfaults during codegen. This is because the chpl_assert_on_gpu runtime function is in runtime/include/gpu/chpl-gpu-gen-common.h. That file isn't included unless you use the GPU locale model, so codegen has no definition to refer to.

We should probably do one of:

  1. assertOnGpu fails at compile time with CHPL_LOCALE_MODEL=flat
  2. assertOnGpu fails at execution time with CHPL_LOCALE_MODEL=flat
  3. assertOnGpu causes GPU transformations to fire even with CHPL_LOCALE_MODEL=flat and fail only if the GPU is not eligible. It should be ignored at execution time.

(1) is interesting, but can be annoying if you want to do some portability checks. But OTOH, it'd be doing literally what it's supposed to. (2) is a good halfway solution. I don't like (3).

Potential use cases for (1) and (3) can also probably be done with CHPL_LOCALE_MODEL=gpu and CHPL_GPU=cpu.

On my branch, I can sweep it under the rug for now. But (2) was very easy to do:

diff --git a/runtime/include/chpl-gen-includes.h b/runtime/include/chpl-gen-includes.h
index 5f8444e605..ec97080355 100644
--- a/runtime/include/chpl-gen-includes.h
+++ b/runtime/include/chpl-gen-includes.h
@@ -73,6 +73,10 @@ chpl_localeID_t chpl_gen_getLocaleID(void)
   return localeID;
 }

+static inline void chpl_assert_on_gpu(int32_t lineno, int32_t filenameIdx) {
+  chpl_error("assertOnGpu() failed", lineno, filenameIdx);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/runtime/include/gpu/chpl-gpu-gen-common.h b/runtime/include/gpu/chpl-gpu-gen-common.h
index 40c54cc05b..3f949c9016 100644
--- a/runtime/include/gpu/chpl-gpu-gen-common.h
+++ b/runtime/include/gpu/chpl-gpu-gen-common.h
@@ -71,9 +71,6 @@ __device__ static inline void chpl_gen_comm_put(void* addr, c_nodeid_t node,
 MAYBE_GPU static inline void chpl_gpu_write(const char *str) { printf("%s", str); }

 __device__ static inline void chpl_assert_on_gpu(int32_t lineno, int32_t filenameIdx) { /* no op */ }
-__host__ static inline void chpl_assert_on_gpu(int32_t lineno, int32_t filenameIdx) {
-  chpl_error("assertOnGpu() failed", lineno, filenameIdx);
-}

 __device__ static inline unsigned int chpl_gpu_clock(void) {
   return (unsigned int)clock();
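
To make the intent of the patch easier to see outside the runtime, here is a minimal, self-contained sketch of option (2) in plain C. It is not the runtime code: chpl_error_stub, last_error, and last_lineno are hypothetical stand-ins (the real chpl_error halts the program, while the stub only records the failure so the behavior can be checked), and the `__host__`/`__device__` split from the GPU header is left out.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the runtime's chpl_error(). The real function
 * halts execution; this stub only records the failure so it can be inspected. */
static const char *last_error = NULL;
static int32_t last_lineno = 0;

static void chpl_error_stub(const char *msg, int32_t lineno,
                            int32_t filenameIdx) {
  (void)filenameIdx; /* unused in this sketch */
  last_error = msg;
  last_lineno = lineno;
  fprintf(stderr, "error at line %d: %s\n", (int)lineno, msg);
}

/* Always-visible host definition, mirroring the idea of moving the function
 * into a header that every locale model includes. Because it is declared
 * unconditionally, generated code can reference it under any configuration,
 * and a call that is reached on the host fails at execution time. */
static inline void chpl_assert_on_gpu(int32_t lineno, int32_t filenameIdx) {
  chpl_error_stub("assertOnGpu() failed", lineno, filenameIdx);
}
```

With this shape, a call such as `chpl_assert_on_gpu(42, 0)` on the host records "assertOnGpu() failed" for line 42 instead of leaving the symbol undefined. The device-side no-op would still come from the GPU-only header, as in the diff above.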

e-kayrakli • Jul 27 '23 00:07