Add address sanitizer build options

Open abouteiller opened this issue 8 months ago • 3 comments

Also add a bit of documentation on how to debug.

TODOs:

  • [x] compile with xnack+
  • [x] document requirements for compiling/running with asan
  • [ ] bug: code crashes when executed under asan

abouteiller, Apr 15 '25 19:04

@abouteiller would it make sense to add an ASAN build to the CI at some point?

edgargabriel, Apr 17 '25 15:04

We use the following HIP function, which appears to be incompatible with ASAN:

error: 'invalid kernel file'(218) at /home/bouteill/rocshmem/rocSHMEM/src/backend_bc.cpp:72

   69   int* device_backend_proxy_addr{nullptr};
   70   CHECK_HIP(
   71       hipGetSymbolAddress(reinterpret_cast<void**>(&device_backend_proxy_addr),
   72                           HIP_SYMBOL(device_backend_proxy)));

Additional problem spot:

__host__ void set_internal_ctx(rocshmem_ctx_t *ctx) {
  CHECK_HIP(hipMemcpyToSymbol(HIP_SYMBOL(ROCSHMEM_CTX_DEFAULT), ctx,
                              sizeof(rocshmem_ctx_t), 0,
                              hipMemcpyHostToDevice));

abouteiller, Apr 17 '25 17:04

I now have a contact person who can help me look into why hipGetSymbolAddress crashes when compiled with ASAN.

abouteiller, May 20 '25 14:05