Inclusion of cuda_wrappers headers needs to be fixed.

Open Artem-B opened this issue 4 years ago • 17 comments

https://github.com/ROCm-Developer-Tools/HIP/blob/dabba8a3678f8cf57994feaa5dfe5591621fb6ce/include/hip/hcc_detail/hip_runtime.h#L517

#include <__clang_cuda_math_forward_declares.h>
#include <__clang_cuda_complex_builtins.h>
// Workaround for using libc++ with HIP-Clang.
// The following headers requires clang include path before standard C++ include path.
// However libc++ include path requires to be before clang include path.
// To workaround this, we pass -isystem with the parent directory of clang include
// path instead of the clang include path itself.
#include <include/cuda_wrappers/algorithm>
#include <include/cuda_wrappers/complex>
#include <include/cuda_wrappers/new>

This is bound to break in interesting ways (already does for us internally).

Those headers are intended to be used with CUDA, and they assume a very particular location in clang's include paths. The assertion that "the following headers require the clang include path before the standard C++ include path" is false. It is the cuda_wrappers directory that must come before the standard C++ library include path; the standard clang headers do not need to.

Here's what canonical CUDA include paths look like:

 /usr/local/google/home/tra/work/llvm/build/debug/lib/clang/11.0.0/include/cuda_wrappers
 /usr/local/cuda/include
 /usr/local/google/home/tra/work/llvm/build/debug/bin/../include/c++/v1
 /usr/local/include
 /usr/local/google/home/tra/work/llvm/build/debug/lib/clang/11.0.0/include
 /usr/include/x86_64-linux-gnu
 /usr/include

A better way to address this is to teach clang that a HIP compilation needs to add the cuda_wrappers path in the right place. Here's where it's done for CUDA: https://github.com/llvm/llvm-project/blob/74e6a5b2a3b0e1992fb38c986b18f1aa7ca6fabd/clang/lib/Driver/ToolChains/Cuda.cpp#L239 HIP will need something similar. AFAICT, there's no reliable way to control the placement of cuda_wrappers in the search path from the command line.
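
For concreteness, here is a rough sketch of what such a driver hook could look like, modeled on the CUDA code linked above. The function name and exact wiring are assumptions for illustration, not the actual HIP toolchain implementation:

#include "clang/Driver/Driver.h"
#include "clang/Driver/Options.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/Option/ArgList.h"
#include "llvm/Support/Path.h"

// Hypothetical helper for the HIP toolchain, patterned after
// CudaInstallationDetector::AddCudaIncludeArgs in clang/lib/Driver/ToolChains/Cuda.cpp.
// The driver (unlike the user) always knows its own resource directory, so it can place
// cuda_wrappers ahead of the standard C++ library paths reliably.
static void addHIPWrapperIncludeArgs(const clang::driver::Driver &D,
                                     const llvm::opt::ArgList &DriverArgs,
                                     llvm::opt::ArgStringList &CC1Args) {
  if (DriverArgs.hasArg(clang::driver::options::OPT_nobuiltininc))
    return;
  // <resource-dir>/include/cuda_wrappers, added as an internal system include path
  // so it is searched before the C++ standard library headers.
  llvm::SmallString<128> P(D.ResourceDir);
  llvm::sys::path::append(P, "include", "cuda_wrappers");
  CC1Args.push_back("-internal-isystem");
  CC1Args.push_back(DriverArgs.MakeArgString(P));
}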

Directly including the cuda_wrappers headers (and only three of them) relies on clang's implementation details and may break at any point in time. Adding an include path to the parent directory is also unreliable. E.g., during a distributed build the cuda_wrappers subdirectory may sit in a directory structure constructed via symlinks, so a simple cuda_wrappers/../.. may point somewhere with no include/cuda_wrappers/ in it. The user has no way to know ahead of time where exactly the compiler's headers are located, and the location may differ between build nodes. The compiler on that machine does know -- another point in favor of teaching clang that HIP needs cuda_wrappers.

Another issue is that these wrapper headers are written with the assumption of being included from a CUDA compilation. They assume that all CUDA-specific macros have already been defined by __clang_cuda_runtime_wrapper.h, which is force-included for all CUDA compilations. As things are now, once cuda_wrappers is in the correct place in the search path, an #include of one of the wrapped standard headers will pick up the wrapper first, and that would fail because things like __device__ are not defined. The wrappers must work correctly if someone includes them from a HIP compilation before hip_runtime.h. E.g. rocm_hip/tests/src/kernel/hipPrintfKernel.cpp https://github.com/ROCm-Developer-Tools/HIP/blob/dabba8a3678f8cf57994feaa5dfe5591621fb6ce/tests/src/kernel/hipPrintfKernel.cpp#L28 does that via test_common.h -> iostream -> [lots of libc++ headers] -> new.
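
For illustration, here is a minimal HIP program with the same shape as that test (a hedged sketch, not taken from the test suite). The standard library, and therefore potentially the wrappers, is reached before any HIP header has defined the CUDA-style macros:

// <iostream> drags in <new> and much of libc++ before hip_runtime.h is seen, so if the
// cuda_wrappers <new> is picked up here it must not assume __device__ etc. already exist.
#include <iostream>             // standard library (and possibly the wrappers) comes first
#include <hip/hip_runtime.h>    // only now are the HIP macros and attributes set up

__global__ void hello() { printf("hello from the device\n"); }

int main() {
  hello<<<1, 1>>>();
  if (hipDeviceSynchronize() != hipSuccess) {
    std::cerr << "kernel did not run\n";
    return 1;
  }
  return 0;
}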

Artem-B avatar May 14 '20 21:05 Artem-B

Thanks for the suggestion. I will try to fix it.

yxsamliu avatar May 15 '20 01:05 yxsamliu

I'm having an issue where hip-clang doesn't see cuda_wrappers at all; I can't find them on my system either. Where do these come from?

fuag15 avatar Jul 26 '21 00:07 fuag15

The wrappers are part of clang installation. They should be in the same directory where clang finds its other header files. https://github.com/llvm/llvm-project/tree/main/clang/lib/Headers

Artem-B avatar Jul 26 '21 00:07 Artem-B

One way to find the path is to run clang -v -x c++ /dev/null -fsyntax-only and check the list of include paths. The path to clang's headers will be towards the end of the list, after the C++ library headers but before /usr/include.

Artem-B avatar Jul 26 '21 00:07 Artem-B

Interesting, thanks for the info @Artem-B. I might be chasing the wrong issue.

My installation of clang did not have the project enabled that gets the cuda_wrappers included in the installation. I fixed that, but...

It seems that, for whatever reason, MIOpen is generating compilation commands that are missing some -isystem directories and also include extra ones.

The command line used to compile naive_conv from PyTorch includes hip_runtime.h via an -isystem, and it can't find include/cuda_wrappers/algorithm although that is in my llvm path. Additionally, the compilation flags have an extra -isystem /usr/include that causes #include_next to fail to find math.h. I've manually removed that -isystem for now and am attempting to debug, but I'm not sure if I should be chasing my llvm-roc, HIP, or MIOpen configuration. :|

fuag15 avatar Jul 26 '21 06:07 fuag15

Can you install the latest ROCm release? It should have the correct configuration for MIOpen, HIP and llvm.

yxsamliu avatar Jul 26 '21 13:07 yxsamliu

@yxsamliu @Artem-B I'm working on Gentoo ebuilds for 4.2.0. I got to the source of the problem and am dumping the info here in case anyone else runs into it, as they are likely to also run across the wrapper include confusion.

MIOpen pulls flags for hip-clang from HIP (or ROCclr). It pulls these through flags defined in the CMake targets of hip::device. In Gentoo, the builds don't keep ROCm libs/headers under their own prefix within /usr/ unless a package conflicts with a system tool (like llvm) or provides its own set of library includes (like hip). This means the HSA headers go into /usr/include like everything else. HIP/ROCclr has entries in its targets that prepend the HSA header include paths as system include paths, which for us generates an -isystem /usr/include. This causes #include_next directives to fail in HIP. The fix for these issues in our configuration was twofold:

  1. Patch out the HSA include logic from hip-targets.cmake.in.
  2. Patch out the extra include in front of cuda_wrappers in hip_platform.

*NOTE: for the cuda_wrappers headers to be installed with llvm-roc, the openmp project targets must be set.

Hopefully this helps someone else debugging similar issues.

fuag15 avatar Jul 26 '21 20:07 fuag15

I'm not at the bottom of this yet; I'm having more #include_next issues now. I'm kind of lost as to why an extra -isystem pointing at /usr/include would make #include_next statements fail to find files that are present in /usr/include.
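
For what it's worth, the general mechanism is that #include_next resumes the header search in the directory after the one the current header was found in, so reordering the search path can hide headers from it. A sketch with made-up directory names (not your exact setup):

// Illustrative wrapper header, e.g. <resource-dir>/include/math.h (paths are examples only).
//
// Default search order:                 With an extra `-isystem /usr/include`:
//   1. <resource-dir>/include             1. /usr/include              <-- hoisted
//   2. /usr/include                       2. <resource-dir>/include
//
// #include_next only searches the directories *after* the one this file was found in.
// In the default order that is /usr/include, so the real math.h is found; in the
// reordered case nothing after <resource-dir>/include provides math.h, so the directive
// fails even though /usr/include still contains the file.
#pragma once
#include_next <math.h>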

fuag15 avatar Jul 26 '21 21:07 fuag15

See the details in the body of the bug report:

AFAICT, there's no reliable way to control the placement of cuda_wrappers in the search path from the command line.

You may hack things into working, but I would not be surprised if your build behaved slightly differently compared to one that lets clang handle the include search order.

E.g. if the wrappers happen to get included only after some critical header because that header is found earlier in the search path, you may end up with a different subset of the standard library available on the GPU, and that would be observable from C++.
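
As a hedged illustration (hypothetical kernel, not from this thread): clang's cuda_wrappers/algorithm adds device-callable overloads of std::min and std::max, so whether something like the following builds for the device can depend on which <algorithm> won the include race.

// Hypothetical device code whose fate depends on the include ordering described above.
#include <algorithm>
#include <hip/hip_runtime.h>

__global__ void clamp_kernel(float *data, int n, float lo, float hi) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n)
    data[i] = std::min(hi, std::max(lo, data[i]));  // needs device-callable min/max
}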

The fact that you are struggling with the search path suggests that there's something odd going on with your system. Could you elaborate on why the special include paths are needed at all?

the HSA headers go into /usr/include like everything else

This could be a problem as it might interfere with the inclusion order.

I would suggest the approach used for CUDA on some Linux distributions where the default packages install the CUDA headers in the same directory as the other packages: basically, you create a fake monolithic ROCm install directory populated with symlinks to the correct files/directories and then point clang at it with --rocm-path.

Artem-B avatar Jul 27 '21 16:07 Artem-B

Did you see -internal-isystem ${llvm_prefix}/lib/clang/12.0.0/include/cuda_wrappers -internal-isystem ${rocm_prefix}/include -include __clang_hip_runtime_wrapper.h when you use -v?

Normally clang detects the ROCm installation path relative to itself and adds these options, and then hip_runtime.h should not trigger the include_next problem, since those includes should be disabled by https://github.com/ROCm-Developer-Tools/HIP/blob/rocm-4.2.x/include/hip/amd_detail/hip_runtime.h#L373
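
Roughly, the idea is a guard like the following (the macro name here is a placeholder assumption; the real condition is at the hip_runtime.h line linked above): when a new-enough clang force-includes __clang_hip_runtime_wrapper.h and adds the -internal-isystem paths itself, the manual wrapper includes are compiled out.

// Sketch only -- the macro below stands in for whatever __clang_hip_runtime_wrapper.h
// actually defines; see the linked hip_runtime.h for the real guard.
#if !defined(__HIP_CLANG_WRAPPER_HEADERS_INCLUDED__)  // placeholder marker macro
#include <include/cuda_wrappers/algorithm>
#include <include/cuda_wrappers/complex>
#include <include/cuda_wrappers/new>
#endif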

yxsamliu avatar Jul 27 '21 16:07 yxsamliu

Hmm, I got it to compile and work, and it seems to be doing great at everything PyTorch can throw at it, but given what you both said I'm worried something is wrong with the system paths. Here are the changes I ended up rolling with.

There's already config in llvm-roc to add include/cuda_wrappers to the path, but it was commented out in the base ebuild I was working with. I undid that comment, so now it does have an -isystem for include/cuda_wrappers. When you ask what the output of -v is, what tool do you mean to run -v on?

It seems like I should change these ebuilds to go with the "standard" ROCm layout and leave things in /usr/rocm with symlinks elsewhere.

fuag15 avatar Jul 28 '21 04:07 fuag15

-v is an option for clang.

For HIP to work properly, clang needs to be able to detect the HIP installation. If detected, clang -v will have output like:

Found HIP installation: /opt/rocm-4.2.0, version 4.2.21155-37cb3a34

clang expects the relative locations of itself, the HIP runtime and the ROCm device library to follow the ROCm installation directory structure, i.e.

/opt/rocm/llvm => compiler
/opt/rocm/amdgcn => device library
/opt/rocm/hip => HIP runtime

As long as the relative locations are correct, clang should be able to detect the ROCm installation.

yxsamliu avatar Jul 28 '21 15:07 yxsamliu

Here's the output of that command:

/usr/lib/llvm/roc/bin/clang -v
clang version 12.0.0
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm/roc/bin
Selected GCC installation: /usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
Found HIP installation: /usr, version 3.5.0

the paths over here are:

/usr/lib/llvm/roc/bin/clang => llvm-roc compiler
/usr/lib/hip/lib/bin/hip => HIP Runtime
/usr/lib/amdgcn => device library

fuag15 avatar Jul 28 '21 15:07 fuag15

clang can detect HIP. That is good. However, the HIP runtime seems very old; the latest release is 4.2. Which branch of HIP are you using?

yxsamliu avatar Jul 28 '21 15:07 yxsamliu

hipcc -v has trouble locating the right clang and rocm_agent_enumerator. I'm going to work on fixing that in the hip package, but hipcc does report HIP version: 4.2.21301- after all the errors. GPU-accelerated PyTorch running the sample MNIST CNN code ran overnight. Thanks for all the help with figuring this stuff out. I'm going to work on changing things around to follow more closely what the ROCm tools expect (a separate rocm dir with symlinks where needed).

fuag15 avatar Jul 28 '21 22:07 fuag15

This topic is not solved, right? Installing HIP as /usr/lib/x86_64-linux-gnu/libamdhip64.so and /usr/include/hip is still not supported in the upcoming ROCm 5.x? Pinging @yxsamliu, @Artem-B, @fuag15, @cgmb.

Here on Debian testing (bookworm), with the official Debian WIP packages installed in the above paths, I get this (and when compiling a compute kernel, clang fails to find math.h):

maxzor@ws:~$ clang -v -x c++ /dev/null -fsyntax-only
Debian clang version 13.0.1-+rc3-1~exp1+b1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
Candidate multilib: .;@m64
Selected multilib: .;@m64
Found HIP installation: /usr, version 4.4.22062
 (in-process)
 "/usr/lib/llvm-13/bin/clang" -cc1 -triple x86_64-pc-linux-gnu -fsyntax-only -disable-free -disable-llvm-verifier -discard-value-
  names -main-file-name null -mrelocation-model static -mframe-pointer=all -fmath-errno -fno-rounding-math -mconstructor-
  aliases -munwind-tables -target-cpu x86-64 -tune-cpu generic -debugger-tuning=gdb -v -fcoverage-compilation-dir=/home
  /maxzor -resource-dir /usr/lib/llvm-13/lib/clang/13.0.1 -internal-isystem /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include
  /c++/11 -internal-isystem /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/x86_64-linux-gnu/c++/11 -internal-isystem 
  /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/backward -internal-isystem /usr/lib/llvm-13/lib/clang/13.0.1
  /include -internal-isystem /usr/local/include -internal-isystem /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../x86_64-linux-
  gnu/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem 
  /usr/include -fdeprecated-macro -fdebug-compilation-dir=/home/maxzor -ferror-limit 19 -fgnuc-version=4.2.1 -fcxx-exceptions 
  -fexceptions -fcolor-diagnostics -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -x c++ /dev/null
clang -cc1 version 13.0.1 based upon LLVM 13.0.1 default target x86_64-pc-linux-gnu
ignoring nonexistent directory "/usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../x86_64-linux-gnu/include"
ignoring nonexistent directory "/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11
 /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/x86_64-linux-gnu/c++/11
 /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/backward
 /usr/lib/llvm-13/lib/clang/13.0.1/include
 /usr/local/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.

Maxzor avatar Feb 09 '22 18:02 Maxzor

@Maxzor, I don't think it was fixed in 5.0 but you'll be able to check for yourself soon. Unfortunately, I've been tied up with (among other things) reviewing similar changes for the math libraries. The need for backwards-compatibility makes them non-trivial.

cgmb avatar Feb 09 '22 20:02 cgmb

@Artem-B, sorry for the lack of response. Please try the latest ROCm 6.0.2 (HIP 6.0.32831) to see if your issue still exists. If resolved, please close the ticket. Thanks.

ppanchad-amd avatar Mar 19 '24 16:03 ppanchad-amd