Warning when compiling for HIP

Open sbastrakov opened this issue 2 years ago • 6 comments

While investigating on our internal AMD system with the HIP backend, I ran into a warning (treated as an error when building the alpaka tests):

In file included from /home/bastra54/src/alpaka/include/alpaka/rand/RandUniformCudaHipRand.hpp:31:
In file included from /opt/rocm-5.2.1/include/hiprand/hiprand_kernel.h:54:
In file included from /opt/rocm-5.2.1/include/hiprand/hiprand_kernel_hcc.h:37:
In file included from /opt/rocm-5.2.1/include/rocrand/rocrand_kernel.h:28:
/opt/rocm-5.2.1/include/rocrand/rocrand_common.h:73:6: error: "Disabled inline asm, because the build target does not support it." [-Werror,-W#warnings]
    #warning "Disabled inline asm, because the build target does not support it."

I am not sure why it does not appear in our CI; it seems we have just been lucky to avoid it until something gets updated. @psychocoderHPC told me it occurs for PIConGPU on AMD as well, but there we don't treat warnings as errors.

This is for HIP 5.2.1. This could be an issue on the HIP side, but then we have to work around it or silence the warning somehow on our side.
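
One possible way to silence the warning on our side is sketched below. It is only a sketch under assumptions, not what alpaka actually does: it wraps the include in Clang diagnostic pragmas that ignore the `-W#warnings` group named in the error message. The `EXAMPLE_HIPRAND_INCLUDED` macro and the `hiprandKernelHeaderIncluded()` helper are hypothetical names, and the `__has_include` guard is there only so the snippet also compiles on a machine without ROCm.

```cpp
// Suppress "#warning" diagnostics only while the hipRAND header is parsed.
// "-W#warnings" is the diagnostic group shown in the error message above.
#if defined(__clang__)
#    pragma clang diagnostic push
#    pragma clang diagnostic ignored "-W#warnings"
#endif

// Hypothetical guard: include the header only if it can be found.
#if defined(__has_include)
#    if __has_include(<hiprand/hiprand_kernel.h>)
#        include <hiprand/hiprand_kernel.h>
#        define EXAMPLE_HIPRAND_INCLUDED 1
#    endif
#endif

// Restore the previous diagnostic state for the rest of the file.
#if defined(__clang__)
#    pragma clang diagnostic pop
#endif

#ifndef EXAMPLE_HIPRAND_INCLUDED
#    define EXAMPLE_HIPRAND_INCLUDED 0
#endif

// Hypothetical helper: reports whether the header was actually included.
constexpr bool hiprandKernelHeaderIncluded()
{
    return EXAMPLE_HIPRAND_INCLUDED != 0;
}
```

Compared to dropping `-Werror` globally, this keeps all other warnings fatal and limits the suppression to this one include. It only hides the message; whether building without the inline asm is acceptable for a given target would still need to be checked.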

sbastrakov avatar Sep 02 '22 08:09 sbastrakov

After investigating, the issue is that by default on that machine I compiled for several architectures, including gfx1030, which I guess is not supported by rocRAND, or at least that was what caused the issue. Once I explicitly set only gfx906, the compilation went fine.

Still, since we observe something similar for PIConGPU on AMD, this may require a fix or further investigation.

sbastrakov avatar Sep 02 '22 09:09 sbastrakov

I know we like to compile our own compilers and runtimes on our internal systems instead of using pre-built binaries. Is this the case for the ROCm installation, too? If so, have we built the libraries with gfx1030 support?

j-stephan avatar Sep 02 '22 11:09 j-stephan

Tbh I have no idea, it's the second time I'm using that machine. @psychocoderHPC do you know?

sbastrakov avatar Sep 02 '22 13:09 sbastrakov

> I know we like to compile our own compilers and runtimes on our internal systems instead of using pre-built binaries. Is this the case for the ROCm installation, too? If so, have we built the libraries with gfx1030 support?

We use pre-built binaries from the apt repositories for our internal development systems. As far as I know, there is no need to compile the HIP compilers to support a specific architecture. The architecture is set during the compilation of the user application. There is a high possibility that rocRAND does not support gfx1030. I assume it is a bug in rocRAND. gfx10XX was introduced into the ROCm ecosystem at the end of last year or the beginning of this year, so I assume nobody has tested rocRAND with this architecture.

psychocoderHPC avatar Sep 05 '22 12:09 psychocoderHPC

> There is a high possibility that rocrand is not supported for gfx1030.

That would be weird since it is a default target in rocRAND's CMake setup:

https://github.com/ROCmSoftwarePlatform/rocRAND/blob/develop/CMakeLists.txt#L82

j-stephan avatar Sep 05 '22 12:09 j-stephan

Note that this error also happens if you compile host code with hipcc where rocrand_kernel.h is included. If I don't forget about this issue, I will write a mini-app and open an issue for AMD.

I assume a hello world for CPU plus the include, compiled with hipcc, will reproduce the warning.
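
A minimal sketch of such a mini-app could look like the following. It is an untested assumption of what the reproducer might be, not the one actually submitted to AMD: a host-only translation unit that does nothing with rocRAND except include its kernel header. The names `MINIAPP_HAS_ROCRAND`, `greeting()` and `rocrandHeaderIncluded()` are hypothetical, and the `__has_include` guard exists only so the file still compiles where ROCm is not installed.

```cpp
// Host-only code plus the rocRAND kernel header. If the assumption in this
// thread holds, the "#warning" is emitted while this include is preprocessed.
#if defined(__has_include)
#    if __has_include(<rocrand/rocrand_kernel.h>)
#        include <rocrand/rocrand_kernel.h>
#        define MINIAPP_HAS_ROCRAND 1
#    endif
#endif

#ifndef MINIAPP_HAS_ROCRAND
#    define MINIAPP_HAS_ROCRAND 0
#endif

// Plain CPU "hello world" payload; no device code and no rocRAND calls.
constexpr char const* greeting()
{
    return "hello world";
}

// Hypothetical helper: tells whether the header was found and included.
constexpr bool rocrandHeaderIncluded()
{
    return MINIAPP_HAS_ROCRAND != 0;
}
```

To turn this into a standalone program, add a `main` that prints `greeting()`. Since the message comes from the preprocessor, compiling the file to an object with hipcc for the affected target (gfx1030 in this thread) should already be enough to check whether the warning shows up.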

psychocoderHPC avatar Sep 05 '22 12:09 psychocoderHPC