[Driver] Wrapper compilation drops target features
Describe the bug
Currently the DPC++ driver uses llc to lower the IR module produced by clang-offload-wrapper to an object file. When doing so it ignores target features (none of the target features from the cc1 clang invocation appear in the llc invocation). This is usually fine on x86, but other platforms (e.g. RISC-V) disallow linking object files with mismatching target features, for example compiling a program with -march="rv64id" leads to can't link soft-float modules with double-float modules.
To reproduce
The issue can be reproduced by trying to cross-compile a SYCL program for RISC-V (with the +d extension enabled). On Ubuntu 22.04:
- install a RISC-V toolchain:
sudo apt install gcc-riscv64-linux-gnu g++-riscv64-linux-gnu - build DPC++ with
--host-target="X86;RISCV"passed toconfigure.py
cross compile a SYCL program:
clang++ --target=riscv64-linux-gnu input.cpp -o out \
-march="rv64id" \
-fsycl -fsycl-targets=spir64 -isysroot=/usr/riscv64-linux-gnu
The command should fail with:
/usr/bin/riscv64-linux-gnu-ld: /tmp/sycl1-wrapper-347ec7.o: can't link soft-float modules with double-float modules
/usr/bin/riscv64-linux-gnu-ld: failed to merge target specific data of file /tmp/sycl1-wrapper-347ec7.o
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
Environment
OS: Ubuntu 22.04 Target device: spirv64
Additional context
This is maybe something that we could address while switching to the clang-linker-wrapper approach in the driver. Maybe we could figure out the target features in the driver and add a --mattr flag to the llc invocation, or we could use clang instead of llc to lower the module that comes from clang-offload-wrapper. Another alternative is to generate the module in clang-offload-wrapper with the correct target features, which should allow to keep the call to llc as simple as it is now.
Hi @intel/dpcpp-clang-driver-reviewers, the issue is a bit hard to reproduce but I think you get the idea, let me know if you need more instructions. Do you have any thoughts about this?
It is in our plans to move away from using llc in our compilation flow. We are also in process to move away from the old offloading model to use the new offloading model via clang-linker-wrapper. Unfortunately, the use of llc still exists there. @asudarsa for awareness
Hi @PietroGhg
As @mdtoguchi mentioned, we do plan to move away from using llc and move to using the new offloading model. We will keep you posted on our progress.
Thanks
Sounds great, thank you @mdtoguchi @asudarsa
The better way to fix this issue before we moved to clang-linker-wrapper is to switch from llc to clang.
@PietroGhg @hvdijk just confirming, was this fixed by https://github.com/intel/llvm/pull/15259?
Thanks, I forgot there was also an issue open about this. For default features, this was fixed by #15259. For non-default flags, I will double-check soon and either update this to reflect the current state, or close it if it is fixed already.
For non-default flags, this is not yet fixed.
Taking an arbitrary "Hello, world!" SYCL program and compiling as
$ clang++ --target=riscv64-linux-gnu -mabi=lp64 -L /home/harald/dpcppllvm/build/riscv64-linux/lib -fsycl hello.cpp
/usr/bin/riscv64-linux-gnu-ld: /tmp/hello-44d0f2.o: can't link soft-float modules with double-float modules
/usr/bin/riscv64-linux-gnu-ld: failed to merge target specific data of file /tmp/hello-44d0f2.o
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
With -### added to the command line, we can see that although clang is now used to process the wrapper, it uses default flags:
...
"/home/harald/dpcppllvm/build/x86_64-linux/bin/clang" "--target=riscv64-unknown-linux-gnu" "-c" "-o" "/tmp/a-940227.o" "/tmp/wrapper-1220e4.bc"
...
Note the absence of any ABI options (as well as optimization options) here: this results compiling it for the default ABI, which may not be compatible with the ABI the user specified.
The better way to fix this issue before we moved to clang-linker-wrapper is to switch from llc to clang.
Hi @bader -- does this comment suggest that moving to clang-linker-wrapper (via --offload-new-driver) better positions us to fix this issue while still using llc for the wrapper compilation step?
I ask because we have some cases where clang fails to process sufficiently large bitcode files, and llc doesn't have this issue.
Hi @bader -- does this comment suggest that moving to clang-linker-wrapper (via
--offload-new-driver) better positions us to fix this issue while still using llc for the wrapper compilation step?
Hi @tcreech-intel.
I'm not sure if simply switching to the new offload driver is enough to solve #14139. https://github.com/llvm/llvm-project/blob/main/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp#L657 - as far as I can tell, user options are not passed to the clang-linker-wrapper tool. I guess the issue should be fixed separately in the upstream.
NOTE: our downstream clang-linker-wrapper uses alternative implementation of the LLVM wrapper module compilation, which invokes clang instead of CodeGen library (as done by the upstream clang-linker-wrapper). See https://github.com/intel/llvm/blob/sycl/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp#L1187. I don't understand the reason for having SYCL specific implementation, but hopefully @YuriPlyakhin's team will fix this as part of the upstreaming effort. In any case, the SYCL implementation also doesn't apply user-provided target features.
I ask because we have some cases where clang fails to process sufficiently large bitcode files, and llc doesn't have this issue.
Is "clang fails to process sufficiently large bitcode files" related to dropping target features? Depending on the answer to this question the new offload driver solution will require either just switching wrapper module compilation from clang to the CodeGen library or passing target features from the compiler driver to the clang-liner-wrapper.
Thanks, @bader. I had not noticed that intel/llvm's clang-linker-wrapper used a different approach.
The clang issues I refer to are not related to target features. Rather, we've run into a few places where Clang tries to bail out when C/C++ sources would be too large for the frontend implementation. (For example, here.) These limits don't really apply to bitcode, of course, but we hit them when we use clang to compile wrapped bitcode because the code path is focused on sources.
The upstream approach (that is, invoking CodeGen from clang-linker-wrapper) should avoid this kind of issue. Hopefully we can move toward that approach.
The upstream approach (that is, invoking CodeGen from clang-linker-wrapper) should avoid this kind of issue. Hopefully we can move toward that approach.
Please, sync with @YuriPlyakhin on plans regarding this work. As far as I know, there are no plans to implement this solution in near future.
Will do. Thanks.