alidist
Bump GCC to the latest version
@davidrohr @pzhristov @aalkin as we discussed. This hopefully will allow us to bump to arrow 10 on linux.
@davidrohr @shahor02 is /System/Volumes/Data/build/alice-ci-workdir/alidist-o2/sw/SOURCES/O2/4678/0/Detectors/CTF/test/test_ctf_io_tpc.cxx:134: error: in "CTFTest": check memcmp(vecIn.data(), bVec.data(), bVec.size()) == 0 has failed
known?
For progress on FLPs see: OCONF-720
Seems like build errors in O2 are caused by some flag changes/improvements in GCC.
In GCC 11, -Wall now includes -Wrange-loop-construct, which is causing some build errors due to -Werror.
GCC 11 also enhanced -Wmaybe-uninitialized, so that finds a few new cases now.
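For illustration, here is a minimal sketch of the kind of loop that -Wrange-loop-construct flags, together with the usual fix. The function names are hypothetical and not taken from O2, and the warning behaviour described in the comments follows the GCC documentation rather than the actual O2 build logs.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper, not from O2.
// Writing `for (const std::string name : names)` would copy every element;
// with GCC 11, -Wall reports that via -Wrange-loop-construct, and -Werror
// turns it into a build failure. Binding by const reference avoids the copy.
std::size_t totalLength(const std::vector<std::string>& names)
{
  std::size_t total = 0;
  for (const std::string& name : names) {
    total += name.size();
  }
  return total;
}

// Hypothetical helper, not from O2.
// A std::map<int, int> stores std::pair<const int, int>, so writing
// `for (const std::pair<int, int>& entry : values)` converts each element
// into a temporary copy, which the same warning reports. `const auto&`
// binds to the stored element type directly.
int sumValues(const std::map<int, int>& values)
{
  int sum = 0;
  for (const auto& entry : values) {
    sum += entry.second;
  }
  return sum;
}
```

In both cases the change is limited to the loop variable declaration, so fixes of this kind should not alter behaviour.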
I'm not sure what's wrong with the GPU build.
Hm, not sure, since it worked for me locally. But we want to bump ROCm anyway soon. Will create a new container for that then. Hopefully it will solve the compilation issues.
@davidrohr apart from the DataDistribution issues there seems to be an error when compiling GPU code via hipcc:
: && /opt/rocm/bin/hipcc -fPIC -O2 -std=c++17 -fgpu-defer-diag -mllvm -amdgpu-enable-lower-module-lds=false -Wno-invalid-command-line-argument -Wno-unused-command-line-argument -Wno-invalid-constexpr -Wno-ignored-optimization-argument -Wno-unused-private-field --amdgpu-target=gfx906 -fgpu-flush-denormals-to-zero -fgpu-rdc -O2 -g -DNDEBUG -Wno-unknown-warning-option --amdgpu-target=gfx906 GPU/GPUbenchmark/hip/CMakeFiles/O2exe-gpu-memory-benchmark-hip.dir/benchmark.hip.cxx.o GPU/GPUbenchmark/hip/CMakeFiles/O2exe-gpu-memory-benchmark-hip.dir/Kernels.hip.cxx.o -o stage/bin/o2-gpu-memory-benchmark-hip -Wl,-rpath,/sw/slc8_x86-64/boost/v1.75.0-local1/lib:/sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib:/opt/rocm/lib::::::::::::::::::::::::: /sw/slc8_x86-64/boost/v1.75.0-local1/lib/libboost_program_options.so.1.75.0 /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libTree.so.6.26.10 /opt/rocm/lib/libamdhip64.so.5.1.50102 /opt/rocm/llvm/lib/clang/14.0.0/lib/linux/libclang_rt.builtins-x86_64.a /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libMathCore.so.6.26.10 /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libImt.so.6.26.10 /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libMultiProc.so.6.26.10 /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libNet.so.6.26.10 /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libRIO.so.6.26.10 /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libThread.so.6.26.10 -lpthread /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libCore.so.6.26.10 && :
ld.lld: error: /sw/slc8_x86-64/boost/v1.75.0-local1/lib/libboost_program_options.so.1.75.0: undefined reference to std::__throw_bad_array_new_length()@GLIBCXX_3.4.29 [--no-allow-shlib-undefined]
ld.lld: error: /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libTree.so.6.26.10: undefined reference to std::__throw_bad_array_new_length()@GLIBCXX_3.4.29 [--no-allow-shlib-undefined]
ld.lld: error: /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libTree.so.6.26.10: undefined reference to std::__istream_extract(std::istream&, char*, long)@GLIBCXX_3.4.29 [--no-allow-shlib-undefined]
ld.lld: error: /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libMathCore.so.6.26.10: undefined reference to std::__throw_bad_array_new_length()@GLIBCXX_3.4.29 [--no-allow-shlib-undefined]
ld.lld: error: /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libThread.so.6.26.10: undefined reference to std::condition_variable::wait(std::unique_lock<std::mutex>&)@GLIBCXX_3.4.30 [--no-allow-shlib-undefined]
ld.lld: error: /sw/slc8_x86-64/ROOT/v6-26-10-alice5-local1/lib/libThread.so.6.26.10: undefined reference to std::__throw_bad_array_new_length()@GLIBCXX_3.4.29 [--no-allow-shlib-undefined]
clang-14: error: linker command failed with exit code 1 (use -v to see invocation)
[513/4119] Building CXX object Utilities/DataSampling/CMakeFiles/O2lib-DataSampling.dir/src/DataSamplingHeader.cxx.o
[514/4119] Building CXX object DataFormats/common/CMakeFiles/O2test-commondataformat-AbstractRefAccessor.dir/test/testAbstractRefAccessor.cxx.o
[515/4119] Building CXX object Framework/GUISupport/CMakeFiles/O2test-framework-CustomGUISokol.dir/test/test_CustomGUISokol.cxx.o
does it ring any bell?
no, as I have already written above, for me locally it does not fail :). But we will bump ROCm soon on the EPNs, then I'll bump the ROCm in the container and hope that it'll fix it. Otherwise we'll need to check in more detail. But I'm following it up.
All's good on FLP side. What's the next step to have it merged?
We have to bump ROCm to 5.3 on the EPN farm and in the FullCI container and then retry, and if it still does not work understand why and fix it.
Okay, any ETA? We would prefer to do it before shifts start.
For reference, it also fails with ROCm 5.3. Not sure what to do now. We could test with ROCm 5.4 on Alma Linux 8.7, but it will still be quite some time until we are there...
OK, fix for ROCm is here: https://github.com/AliceO2Group/AliceO2/pull/10692 @ktf : could you rebase this PR?
I resolved the conflicts of this PR. The O2 fix is merged. In my docker container, it built successfully now. So now the CI should hopefully pass.
ok, it seems we need ROCm 5.3 in addition to my fix. This will be rolled out today on the EPNs, then we can update the containers.
There were also some errors in O2Physics, for which I just opened a PR.
@TimoWilken Can you cache the PR so that we can then merge it?