Fix `HIT.cmake` handling of no architectures found

Open · msimberg opened this issue 2 years ago · 1 comment

When trying to install HIP on a system without any devices available (such as a login node without GPUs on a cluster), the CMake configuration fails in `HIT.cmake` like this:

-- CMAKE_TESTING_TOOL:
-- CMAKE HIP ARCHITECTURES: OFF
Traceback (most recent call last):
  File "/opt/rocm/bin/rocm_agent_enumerator", line 257, in <module>
    main()
  File "/opt/rocm/bin/rocm_agent_enumerator", line 241, in main
    target_list = readFromKFD()
  File "/opt/rocm/bin/rocm_agent_enumerator", line 193, in readFromKFD
    for node in sorted(os.listdir(topology_dir)):
FileNotFoundError: [Errno 2] No such file or directory: '/sys/class/kfd/kfd/topology/nodes/'
-- ROCm Agent Enumurator Result: 1
CMake Error at /tmp/simbergm/spack-stage/spack-stage-hip-5.4.3-6riler52zjomlj3yrpd3fntoq77twl2n/spack-src/tests/hit/HIT.cmake:49 (string):
  string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
  CMakeLists.txt:461 (include)


-- ROCm Agent Enumurator found no valid architectures
-- Configuring incomplete, errors occurred!

This happens when building HIP through Spack (I can provide instructions to reproduce this if needed).

This PR fixes the call to `string(REPLACE ...)` by quoting `HIP_GPU_ARCH`, so that the call no longer fails when the variable is empty. Alternatively, or additionally, the non-zero exit code of `rocm_agent_enumerator` could be handled, but I don't know whether it can return a non-zero exit code and still list some architectures, so I haven't changed anything in that regard.
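For illustration, here is a minimal sketch of why quoting matters. It is not the actual contents of `HIT.cmake`: the variable names `enum_result` and `HIP_GPU_ARCH_LIST`, the newline separator, and the way the enumerator is invoked are all assumptions made for the example. The only point is that an unquoted empty variable expands to zero arguments, whereas a quoted one is still passed as a single (empty) argument.

```cmake
# Sketch only, not the real HIT.cmake. Save as quote_demo.cmake and run with:
#   cmake -P quote_demo.cmake
# Assumption: the enumerator is found by name; if it is missing or fails,
# HIP_GPU_ARCH simply ends up empty, as on a login node without GPUs.
execute_process(
  COMMAND rocm_agent_enumerator
  RESULT_VARIABLE enum_result        # hypothetical variable name
  OUTPUT_VARIABLE HIP_GPU_ARCH
  ERROR_QUIET
  OUTPUT_STRIP_TRAILING_WHITESPACE)

# Possible extra guard on the exit status. Whether a non-zero status can still
# come with usable output is an open question, so this only reports it.
if(NOT enum_result EQUAL 0)
  message(STATUS "Enumerator did not exit cleanly: ${enum_result}")
endif()

# Unquoted, an empty HIP_GPU_ARCH expands to no argument at all, leaving
# REPLACE with three arguments and triggering the "requires at least four
# arguments" error:
#   string(REPLACE "\n" ";" HIP_GPU_ARCH_LIST ${HIP_GPU_ARCH})

# Quoted, the empty string still counts as the fourth argument.
string(REPLACE "\n" ";" HIP_GPU_ARCH_LIST "${HIP_GPU_ARCH}")

if(HIP_GPU_ARCH_LIST STREQUAL "")
  message(STATUS "No valid architectures found")
else()
  message(STATUS "Architectures: ${HIP_GPU_ARCH_LIST}")
endif()
```

With the quoted form, configuration can continue to the "no valid architectures" message instead of aborting inside `string(REPLACE ...)`.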

As a flyby, this also fixes some typos of "Enumurator".

msimberg · Apr 18 '23 16:04

Ping? Any comments on this?

msimberg · May 17 '23 13:05