[Bug]: `CHPL_ROCM_PATH` is set to the wrong ROCm installation path

Open Guillaume-Helbecque opened this issue 5 months ago • 6 comments

Summary of Problem

Description: I tried to build Chapel 2.1 on a system where ROCm 6.0.3 is the default installation and the ROCm 5.4.6 module is loaded. I loaded 5.4.6 because ROCm 6.0.3 is not supported by Chapel 2.1. In this configuration, the build fails with `Error: command not found: /***/***/***/rocm/llvm/bin/llvm-config`. The issue is that Chapel sets `CHPL_ROCM_PATH` to the default ROCm installation path, which is not supported, instead of the one I loaded; the expected location is `/***/***/***/rocm/5.4.6/llvm/bin/llvm-config`. Manually setting `CHPL_ROCM_PATH=/***/***/***/rocm/5.4.6` fixes the issue.

~[edit: After a quick discussion with the experts managing the system, it seems that the issue may not come from how Chapel detects the ROCm installation, but from how it tries to find `llvm-config`. The leading directories of the path in the error message are those of the ROCm 5.4.6 module. However, for some reason, rather than looking in `rocm/5.4.6/llvm`, Chapel drops the `5.4.6` component from the directory it searches. Exporting `CHPL_LLVM_CONFIG=/***/***/***/rocm/5.4.6/llvm/bin/llvm-config` also solves the issue.]~

If I'm not mistaken, the heuristic that searches for the ROCm installation path runs `which hipcc` and derives the installation path from the result. However, running `which hipcc` on this system returns the correct path: `/***/***/***/rocm/5.4.6/bin/hipcc`. This suggests that the problem occurs when Chapel has to choose between two (or more) possible installation paths.
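To make the workaround concrete, here is a minimal sketch of deriving both override values from the location of `hipcc`. It is not Chapel's actual detection code: the function names are hypothetical, it assumes the `<root>/bin/hipcc` and `<root>/llvm/bin/llvm-config` layout seen in the paths above, and `/opt/rocm/5.4.6` in the comments is only a placeholder path.

```python
import os
import shutil
from typing import Optional


def rocm_root_from_hipcc(hipcc_path: str) -> str:
    """Return the directory two levels above hipcc.

    Assumes the layout <root>/bin/hipcc, e.g. the placeholder path
    /opt/rocm/5.4.6/bin/hipcc gives /opt/rocm/5.4.6.
    Symlinks are deliberately NOT resolved, so a versioned module
    directory is kept exactly as it appears on PATH.
    """
    return os.path.dirname(os.path.dirname(hipcc_path))


def suggest_overrides(hipcc_path: Optional[str] = None) -> dict:
    """Build candidate values for the two variables mentioned in this issue."""
    hipcc_path = hipcc_path or shutil.which("hipcc")
    if hipcc_path is None:
        raise FileNotFoundError("hipcc not found on PATH")
    root = rocm_root_from_hipcc(hipcc_path)
    return {
        "CHPL_ROCM_PATH": root,
        # Assumed layout: llvm-config lives under <root>/llvm/bin.
        "CHPL_LLVM_CONFIG": os.path.join(root, "llvm", "bin", "llvm-config"),
    }


if __name__ == "__main__":
    try:
        for name, value in suggest_overrides().items():
            print(f"export {name}={value}")
    except FileNotFoundError as err:
        print(f"# {err}")
```

Running the script with the ROCm 5.4.6 module loaded prints `export` lines that can be pasted into the shell before building Chapel. Skipping symlink resolution is a design choice here: resolving links could land in a different directory than the one the module exposes.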

May be related to #23542.

Is this issue currently blocking your progress? No. Setting path(s) manually makes things work.

Guillaume-Helbecque, Sep 17 '24 08:09