
Link to LLVM failed

Open xuantengh opened this issue 2 years ago • 22 comments

I'm trying to build comgr from source, with the following cmake config:

cmake -S . -B build \
-DCMAKE_BUILD_TYPE=Release \
-DAMDDeviceLibs_DIR=$HOME/opt/rocm/5.4.3/lib/cmake/AMDDeviceLibs \
-DLLD_DIR=$HOME/opt/llvm/15.0.7/lib/cmake/lld \
-DClang_DIR=$HOME/opt/llvm/15.0.7/lib/cmake/clang \
-DROCM_DIR=$HOME/opt/rocm/5.4.3/share/rocm/cmake \
-DCMAKE_INSTALL_PREFIX=$PWD/install

cmake --build build

The build fails at link time with undefined references to LLVM library symbols:

/usr/bin/ld: CMakeFiles/amd_comgr.dir/src/comgr-metadata.cpp.o: in function `COMGR::metadata::getMetadataRoot(COMGR::DataObject*, COMGR::DataMeta*)':
comgr-metadata.cpp:(.text+0x2b5): undefined reference to `llvm::object::object_category()'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x2cf): undefined reference to `llvm::StringError::StringError(llvm::Twine const&, std::error_code)'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x2f9): undefined reference to `llvm::object::object_category()'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x313): undefined reference to `llvm::StringError::StringError(llvm::Twine const&, std::error_code)'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x4ad): undefined reference to `llvm::object::object_category()'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x4c2): undefined reference to `llvm::StringError::StringError(llvm::Twine const&, std::error_code)'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x4ec): undefined reference to `llvm::object::object_category()'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x501): undefined reference to `llvm::StringError::StringError(llvm::Twine const&, std::error_code)'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x715): undefined reference to `llvm::object::object_category()'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x72f): undefined reference to `llvm::StringError::StringError(llvm::Twine const&, std::error_code)'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x759): undefined reference to `llvm::object::object_category()'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x773): undefined reference to `llvm::StringError::StringError(llvm::Twine const&, std::error_code)'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x8fd): undefined reference to `llvm::object::object_category()'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x917): undefined reference to `llvm::StringError::StringError(llvm::Twine const&, std::error_code)'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x941): undefined reference to `llvm::object::object_category()'
/usr/bin/ld: comgr-metadata.cpp:(.text+0x95b): undefined reference to `llvm::StringError::StringError(llvm::Twine const&, std::error_code)'
/usr/bin/ld: CMakeFiles/amd_comgr.dir/src/comgr-metadata.cpp.o: in function `COMGR::metadata::getELFObjectFileBase(COMGR::DataObject*)':

I'm using the latest AMD LLVM and device libs from the amd-stg-open branch. The full build log is available in build.log. Is there a way to fix this? Thanks in advance.

xuantengh avatar Mar 04 '23 02:03 xuantengh

The AMD LLVM project is configured as:

cmake -S llvm -B build -G "Ninja" \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_C_COMPILER=gcc \
-DCMAKE_CXX_COMPILER=g++ \
-DBUILD_SHARED_LIBS=ON \
-DLLVM_ENABLE_LIBCXX=ON \
-DLLVM_ENABLE_PROJECTS="llvm;clang;lld" \
-DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi" \
-DCMAKE_INSTALL_PREFIX=$HOME/opt/rocm/5.4.3 \
-DLLVM_TARGETS_TO_BUILD="X86;AMDGPU"
cmake --build build
cmake --build build --target install

This issue can be fixed by disabling shared library support in the AMD LLVM project (i.e., removing the -DBUILD_SHARED_LIBS option). I'm not sure why enabling this option makes the comgr build fail.

xuantengh avatar Mar 04 '23 07:03 xuantengh

Currently building Comgr against an LLVM with -DBUILD_SHARED_LIBS=ON is not supported. There has been an ongoing effort to switch to shared/dynamic libraries, and it is almost but not quite ready.

I will update here once shared library support is ready, but for now you should try building with static LLVM libraries.
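To confirm which mode an existing LLVM build tree was configured in, the cached value of BUILD_SHARED_LIBS can be read back from its CMakeCache.txt. A minimal sketch, assuming a POSIX shell; the function name llvm_shared_setting is hypothetical and not part of any LLVM or ROCm tooling:

```shell
# Print the cached BUILD_SHARED_LIBS value from a CMake build directory.
# Prints "unset" when the option was never written to the cache.
# The function name is hypothetical, chosen for this illustration only.
llvm_shared_setting() {
    cache="$1/CMakeCache.txt"
    if [ ! -f "$cache" ]; then
        echo "no CMakeCache.txt in $1" >&2
        return 1
    fi
    # Cache entries look like NAME:TYPE=VALUE, e.g. BUILD_SHARED_LIBS:BOOL=ON
    value=$(sed -n 's/^BUILD_SHARED_LIBS:[A-Z]*=//p' "$cache" | head -n 1)
    echo "${value:-unset}"
}
```

Calling llvm_shared_setting /path/to/llvm-project/build and getting ON back would indicate the unsupported configuration described above; OFF (the LLVM default) is the one Comgr is expected to link against.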

lamb-j avatar Mar 27 '23 21:03 lamb-j

Hi. I am facing a similar issue. I get undefined symbol errors like:

ld.lld: error: undefined symbol: llvm::cl::ResetAllOptionOccurrences()
>>> referenced by comgr.cpp:366 (/builddir/build/BUILD/ROCm-CompilerSupport-rocm-5.7.1/lib/comgr/src/comgr.cpp:366)
>>>               CMakeFiles/amd_comgr.dir/src/comgr.cpp.o:(COMGR::clearLLVMOptions())

Indeed by setting -DBUILD_SHARED_LIBS=OFF, the build does pass.

The strange thing is the Arch package does NOT use this: https://gitlab.archlinux.org/archlinux/packaging/packages/comgr/-/blob/a673a34303cc87e813784a7a905da539ee92ac06/PKGBUILD

What is the impact of using -DBUILD_SHARED_LIBS=OFF for the rest of the stack build?

squid-f avatar Oct 30 '23 20:10 squid-f

What is the impact of using -DBUILD_SHARED_LIBS=OFF for the rest of the stack build?

I have not yet faced any issue related to this.

xuantengh avatar Oct 31 '23 02:10 xuantengh

Thanks @Huangxt57 for your prompt reply. I have built the stack up to https://github.com/ROCm-Developer-Tools/clr to try to bring OpenCL to my RX6600, and it does not work (yet). The thing is, building clr also requires -DBUILD_SHARED_LIBS=OFF. I am now getting a detection issue with the clinfo provided by ROCm. Since no .so is created, only a .a, should I point to this .a in the icd file?

squid-f avatar Oct 31 '23 19:10 squid-f

Hi. The build of rocm-clr with -DBUILD_SHARED_LIBS=OFF fails to provide OpenCL support; reported in https://github.com/ROCm-Developer-Tools/clr/issues/26

I decided then to patch rocm-clr to be able to get libamdocl64.so : rocm-compilersupport-5.7.1-allow-lld-undefined.patch.txt

What could be the drawback of doing so?

squid-f avatar Nov 19 '23 17:11 squid-f

@squid-f What's your final objective here? Are you just trying to get a build of Comgr and CLR working together?

You should be able to build both CLR and Comgr with their default "BUILD_SHARED_LIBS" settings. I don't normally set either when building the two separate projects.

For CLR, I use something like the following to build:

cmake -DCLR_BUILD_HIP=ON -DCLR_BUILD_OCL=ON -DHIP_COMMON_DIR=/path/to/hip -DCMAKE_INSTALL_PREFIX=$PWD/install ..
make -j32
make install

For Comgr:

export LLVM_PROJECT=/path/to/llvm-project/build  # AMD LLVM branch
export DEVICE_LIBS=/path/to/llvm-project/amd/device-libs/build
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="$LLVM_PROJECT;$DEVICE_LIBS" ..

lamb-j avatar Nov 20 '23 16:11 lamb-j

Hi @lamb-j. My final goal is to bring OpenCL support for AMD GPU NAVI architectures to Mageia Linux. For that, I have followed the Arch packaging strategy and built the following packages: Rocm-llvm, Rocm-cmake, Rocm-core, Hsakmt, Rocm-device-libs, Rocm-compilersupport (comgr), Rocm-runtime (hsa-rocr), Rocminfo, Rocm-clr.

Like @Huangxt57, I found that Comgr doesn't build with -DBUILD_SHARED_LIBS=ON. But if I use -DBUILD_SHARED_LIBS=OFF, I cannot build CLR later on... So I patched Comgr to build it with -DBUILD_SHARED_LIBS=ON (or, as you do, without setting it, which is equivalent). However, my GPU is not yet detected as a device within the platform (as explained in https://github.com/ROCm-Developer-Tools/clr/issues/26 ).

Any clue why?

Thanks

squid-f avatar Nov 20 '23 21:11 squid-f

I'd probably suggest following Fedora if you need to stick to shared libs. Fedora uses the upstream LLVM release branches, but RadeonOpenCompute/llvm-project follows llvm-project upstream's trunk ("main" branch), so the API is unstable, defeating the benefit of shared libs. If you want a more stable LLVM experience, I highly recommend following Fedora in this regard.

If you can build with shared libs OFF, then you should be able to use RadeonOpenCompute/llvm-project without issue as long as they are from the same release branch, e.g. "5.7.x".

As for building from amd-stg-open, you'll have more of a mixed bag on what will and will not compile against clr. In theory, you should be able to build clr's develop branch against amd-stg-open, but your mileage may vary.

Mystro256 avatar Nov 20 '23 22:11 Mystro256

-DLLD_DIR=$HOME/opt/llvm/15.0.7/lib/cmake/lld
-DClang_DIR=$HOME/opt/llvm/15.0.7/lib/cmake/clang \

I noticed you're using llvm 15, any reason the distro doesn't have 16 or 17 available?

Either way, I have some forks available if you need to use llvm upstream rather than RadeonOpenCompute/llvm-project: https://github.com/Mystro256/ROCm-Device-Libs/branches https://github.com/Mystro256/ROCm-CompilerSupport/branches

Note the llvm 17 branch should be the same as the official RadeonOpenCompute one. These are the branches used by Fedora since LLVM 14.

Mystro256 avatar Nov 20 '23 22:11 Mystro256

Thanks @Mystro256 for stepping in. The first post you are referring to is not from me. I do use LLVM 17, from a rocm-llvm package that I built specifically to build the rest of the stack. I have followed the Arch approach here. The weird thing is that Arch appears to be able to build with shared libs and I can't, even though I use the exact same build flags... hence the patch I had to develop.

squid-f avatar Nov 21 '23 07:11 squid-f

I think the shared library issue is orthogonal to your issue. It's fine to build Comgr as a shared library. That's the default currently, and you shouldn't need any patch or BUILD_SHARED_LIBS options to have that working. It's also fine to build CLR as a shared library. Again the default, no patches or CMake variables needed.

The one thing you can't currently do is build LLVM as a shared library and use that when building Comgr. But you'll only hit that issue if you manually set BUILD_SHARED_LIBS=ON in your LLVM build (this is what @Huangxt57 hit in this original issue), because the LLVM default is static libraries.
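One rough way to tell which kind of LLVM an install tree provides is to look at how a representative component library was installed. A minimal sketch, assuming that a BUILD_SHARED_LIBS=ON build installs libLLVMSupport.so* while a default static build installs libLLVMSupport.a; the function name llvm_lib_kind is hypothetical:

```shell
# Report whether an LLVM lib directory holds per-component shared or static
# libraries, using libLLVMSupport as a representative component.
# Assumption: BUILD_SHARED_LIBS=ON installs libLLVMSupport.so*, while the
# default static build installs libLLVMSupport.a.
llvm_lib_kind() {
    libdir="$1"
    for f in "$libdir"/libLLVMSupport.so*; do
        # An unmatched glob stays literal, so test that the file exists.
        if [ -e "$f" ]; then
            echo "shared"
            return 0
        fi
    done
    if [ -e "$libdir/libLLVMSupport.a" ]; then
        echo "static"
        return 0
    fi
    echo "unknown"
    return 1
}
```

For example, llvm_lib_kind /path/to/llvm/lib printing "shared" would suggest that LLVM was built with BUILD_SHARED_LIBS=ON. On installs that ship llvm-config, llvm-config --shared-mode should give a similar answer.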

lamb-j avatar Nov 21 '23 15:11 lamb-j

The one thing you can't currently do is build LLVM as a shared library and use that when building Comgr. But you'll only hit that issue if you manually set BUILD_SHARED_LIBS=ON in your LLVM build (this is what @Huangxt57 hit in this original issue), because the LLVM default is static libraries.

Hi.

Here are the flags used to build rocm-llvm 5.7.1:

-G Ninja \
        -B build \
        -S "./llvm" \
        -DCMAKE_BUILD_TYPE=RelWithDebInfo \
        -DCMAKE_INSTALL_PREFIX=%{install_prefix} \
        -DLLVM_HOST_TRIPLE=%{_host} \
        -DLLVM_DEFAULT_TARGET_TRIPLE=%{_host} \
        -DLLVM_ENABLE_PROJECTS='llvm;clang;compiler-rt;lld' \
        -DLLVM_TARGETS_TO_BUILD='AMDGPU;NVPTX;X86' \
        -DCLANG_DEFAULT_LINKER=lld \
        -DLLVM_INSTALL_UTILS=ON \
        -DLLVM_ENABLE_BINDINGS=OFF \
        -DLLVM_LINK_LLVM_DYLIB=OFF \
        -DLLVM_BUILD_LLVM_DYLIB=OFF \
        -DLLVM_LINK_LLVM_DYLIB=OFF \
        -DLLVM_ENABLE_ASSERTIONS=ON \
        -DOCAMLFIND=NO \
        -DLLVM_ENABLE_OCAMLDOC=OFF \
        -DLLVM_INCLUDE_BENCHMARKS=OFF \
        -DLLVM_BUILD_TESTS=ON \
        -DLLVM_INCLUDE_TESTS=ON \
        -DCLANG_INCLUDE_TESTS=ON \
        -DLLVM_BINUTILS_INCDIR=%{_includedir}

However, the log output shows:

/usr/bin/cmake -Wno-dev -S . -B build -DCMAKE_CXX_FLAGS_RELWITHDEBINFO:STRING=-DNDEBUG -DCMAKE_C_FLAGS_RELWITHDEBINFO:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_LIBDIR:PATH=lib64 -DCMAKE_INSTALL_LIBEXECDIR:PATH=libexec -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_RUNSTATEDIR:PATH=/run -DCMAKE_INSTALL_SYSCONFDIR:PATH=/etc -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLIB_SUFFIX=64 -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON '-DCMAKE_MODULE_LINKER_FLAGS=-Wl,--as-needed  -Wl,-z,relro -Wl,-O1 -Wl,--build-id=sha1 -Wl,--enable-new-dtags' -DBUILD_SHARED_LIBS:BOOL=ON -DBUILD_STATIC_LIBS:BOOL=OFF -G Ninja -B build -S ./llvm -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_INSTALL_PREFIX=/usr/lib64/rocm/llvm -DLLVM_HOST_TRIPLE=x86_64-mageia-linux-gnu -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-mageia-linux-gnu '-DLLVM_ENABLE_PROJECTS=llvm;clang;compiler-rt;lld' '-DLLVM_TARGETS_TO_BUILD=AMDGPU;NVPTX;X86' -DCLANG_DEFAULT_LINKER=lld -DLLVM_INSTALL_UTILS=ON -DLLVM_ENABLE_BINDINGS=OFF -DLLVM_LINK_LLVM_DYLIB=OFF -DLLVM_BUILD_LLVM_DYLIB=OFF -DLLVM_LINK_LLVM_DYLIB=OFF -DLLVM_ENABLE_ASSERTIONS=ON -DOCAMLFIND=NO -DLLVM_ENABLE_OCAMLDOC=OFF -DLLVM_INCLUDE_BENCHMARKS=OFF -DLLVM_BUILD_TESTS=ON -DLLVM_INCLUDE_TESTS=ON -DCLANG_INCLUDE_TESTS=ON -DLLVM_BINUTILS_INCDIR=/usr/include

So it looks like something adds -DBUILD_SHARED_LIBS:BOOL=ON -DBUILD_STATIC_LIBS:BOOL=OFF. I will investigate whether a macro from my build system does that.
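Spotting such injected options in a long logged command line is easier with a small filter. A minimal sketch, assuming the logged cmake invocation is available as text on stdin; the function name find_lib_mode_flags is hypothetical:

```shell
# List every BUILD_SHARED_LIBS / BUILD_STATIC_LIBS definition found in a
# logged cmake command line, one per line, in the order they appear.
# The function name is hypothetical; the command line is read from stdin.
find_lib_mode_flags() {
    # Split on spaces, drop single quotes added by the logger, then keep
    # only -D definitions of the two library-mode variables.
    tr -s ' ' '\n' | tr -d "'" | grep -E '^-DBUILD_(SHARED|STATIC)_LIBS(:[A-Z]+)?='
}
```

Piping the logged line above through find_lib_mode_flags would list the two definitions that were not in the spec file. As far as I know, CMake applies the last definition of a variable given on the command line, so an explicit -DBUILD_SHARED_LIBS=OFF placed after the injected one should take precedence, but that is worth verifying against the resulting CMakeCache.txt.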

squid-f avatar Nov 22 '23 07:11 squid-f

So, in conclusion: my build system was indeed changing the default setting of rocm-llvm, so it was not being built with BUILD_SHARED_LIBS=OFF. I fixed that to get rocm-llvm built with static libs. I was then able to build ROCm-CompilerSupport with BUILD_SHARED_LIBS=ON without any patch. I can now also build rocm-clr with BUILD_SHARED_LIBS=ON.

The issue is solved for me. Thanks @lamb-j @Mystro256

I might open another report, as my GPU gfx1032 (AMD Radeon RX 6600) is still not found by OpenCL...

squid-f avatar Nov 23 '23 21:11 squid-f

Hi. I need to correct myself: thanks to rocm-llvm built with BUILD_SHARED_LIBS=OFF, I was able to build rocm-clr with BUILD_SHARED_LIBS=ON, and now my GPU gfx1032 (AMD Radeon RX 6600) is found by OpenCL! So there is now a complete ROCm stack available for Mageia Linux.

Thanks again!

squid-f avatar Nov 24 '23 20:11 squid-f

Nice work with the investigation @squid-f, and thanks for your efforts to port ROCm to Mageia.

Still leaving this ticket open, as @Huangxt57's original issue still exists (can't build Comgr against a shared-lib LLVM)

lamb-j avatar Nov 27 '23 18:11 lamb-j