rmm
WIP: Prevent path conflict in builds
Fixes #1528.
Contributes to https://github.com/rapidsai/build-planning/issues/54 and https://github.com/rapidsai/build-planning/issues/56.
Related to https://github.com/rapidsai/rapids-cmake/pull/592
Notes for Reviewers
This is not ready for review yet.
Related conversations:
- #1177
This is currently failing with the following conflicts.
(I've included just 1 example of each type below)
(CUDA 11.8 build) (CUDA 12.2 build)
fmt headers in include/fmt (conflicting packages: conda-forge/fmt, librmm)
This transaction has incompatible packages due to a shared path.
packages: conda-forge/linux-aarch64::fmt-10.2.1-h2a328a1_0, file:///tmp/conda-bld-output/linux-aarch64::librmm-24.06.00a16-cuda12_240419_g9dfd9070_16
path: 'include/fmt/chrono.h'
fmt build scripts in lib/cmake/fmt/ (conflicting packages: conda-forge/fmt, librmm)
This transaction has incompatible packages due to a shared path.
packages: conda-forge/linux-aarch64::fmt-10.2.1-h2a328a1_0, file:///tmp/conda-bld-output/linux-aarch64::librmm-24.06.00a16-cuda12_240419_g9dfd9070_16
path: 'lib/cmake/fmt/fmt-targets.cmake'
fmt pkgconfig script (conflicting packages: conda-forge/fmt, librmm)
This transaction has incompatible packages due to a shared path.
packages: conda-forge/linux-aarch64::fmt-10.2.1-h2a328a1_0, file:///tmp/conda-bld-output/linux-aarch64::librmm-
path: 'lib/pkgconfig/fmt.pc'
spdlog headers (conflicting packages: conda-forge/spdlog, librmm)
This transaction has incompatible packages due to a shared path.
packages: conda-forge/linux-aarch64::spdlog-1.12.0-h6b8df57_2, file:///tmp/conda-bld-output/linux-aarch64::librmm-24.06.00a16-cuda12_240419_g9dfd9070_16
path: 'include/spdlog/async.h'
spdlog build scripts (conflicting packages: conda-forge/spdlog, librmm)
This transaction has incompatible packages due to a shared path.
packages: conda-forge/linux-aarch64::spdlog-1.12.0-h6b8df57_2, file:///tmp/conda-bld-output/linux-aarch64::librmm-24.06.00a16-cuda12_240419_g9dfd9070_16
path: 'lib/cmake/spdlog/spdlogConfig.cmake'
spdlog pkgconfig script (conflicting packages: conda-forge/spdlog, librmm)
This transaction has incompatible packages due to a shared path.
packages: conda-forge/linux-aarch64::spdlog-1.12.0-h6b8df57_2, file:///tmp/conda-bld-output/linux-aarch64::librmm-24.06.00a16-cuda12_240419_g9dfd9070_16
path: 'lib/pkgconfig/spdlog.pc'
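Since only one example of each type is listed above, a small script can help summarize the full log. This is a minimal sketch, assuming every warning has the shape shown above (a `packages:` line followed by a `path:` line) and that package file names follow the usual conda `name-version-build` layout. The names `pkg_name` and `summarize_clobbers`, and the choice to group by the first two path components, are illustrative and not part of conda or the RAPIDS CI tooling.

```python
import re
from collections import Counter

# Matches the "packages: a, b" line of a clobber warning.
PKGS_RE = re.compile(r"^\s*packages:\s*(.+)$")
# Matches the "path: '...'" line that follows it.
PATH_RE = re.compile(r"^\s*path:\s*'([^']+)'")


def pkg_name(spec):
    """Reduce 'channel/subdir::name-version-build' to just 'name'."""
    return spec.split("::")[-1].rsplit("-", 2)[0]


def summarize_clobbers(log_text, depth=2):
    """Count clobbered paths per (path prefix, package names) pair."""
    counts = Counter()
    packages = None
    for line in log_text.splitlines():
        pkgs_match = PKGS_RE.match(line)
        if pkgs_match:
            # Remember which packages the next "path:" line belongs to.
            specs = pkgs_match.group(1).split(",")
            packages = tuple(pkg_name(s.strip()) for s in specs)
            continue
        path_match = PATH_RE.match(line)
        if path_match and packages is not None:
            # Keep only the first `depth` components, e.g. "include/fmt".
            prefix = "/".join(path_match.group(1).split("/")[:depth])
            counts[(prefix, packages)] += 1
            packages = None
    return counts
```

Calling `summarize_clobbers(open("build.log").read()).most_common()` (where `build.log` is a hypothetical saved CI log) would list the most frequently clobbered locations first.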
I can now see this clobbering blocking RMM PRs such as #1537. Can't build RMM C++ in CI due to path conflicts for spdlog and fmt. e.g.
ClobberWarning: This transaction has incompatible packages due to a shared path.
packages: conda-forge/linux-64::fmt-10.2.1-h00ab1b0_0, file:///tmp/conda-bld-output/linux-64::librmm-24.06.00a20-cuda11_240423_ga4d6c965_20
path: 'include/fmt/args.h'
@harrism I don't think the clobbering stuff is what's causing that PR to fail (although it is generating thousands of lines of scary-looking logs 😅 ).
#1537 is failing because it removes a file but not the corresponding test in the conda recipe.
+ test -f /opt/conda/conda-bld/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pl/include/rmm/thrust_rmm_allocator.h
WARNING: Tests failed for librmm-24.06.00a20-cuda12_240423_ga4d6c965_20.tar.bz2 - moving package to /opt/conda/conda-bld/broken
Remove this test there:
https://github.com/rapidsai/rmm/blob/9e6db746f1a4a6361fb9fadf381f749dc52faaea/conda/recipes/librmm/meta.yaml#L84
Oh how did you even find that? I just saw all the ClobberWarnings.
Any way to make those failures fail with the text "error:"? This is usually what I search for in the logs.
> Oh how did you even find that? I just saw all the ClobberWarnings.
I went straight to the end of the log and read back up from there until I saw something problematic. A lot of the CI scripts across RAPIDS use set -e -u -o pipefail, so they tend to fail at the first place where something goes wrong.
I also had just looked at these tests in the conda recipe today, in the process of testing for this PR, so had some pattern recognition for what it looked like when they failed.
> Any way to make those failures fail with the text "error:"? This is usually what I search for in the logs.
Not that I'm aware of. That comes from within conda itself, and I don't think we can control it.
It does say "fail" which I usually search for along with "error".
...
WARNING: Tests failed for librmm-24.06.00a20-cuda11_240423_ga4d6c965_20.tar.bz2 - moving package to /opt/conda/conda-bld/broken
...
TESTS FAILED: librmm-24.06.00a20-cuda11_240423_ga4d6c965_20.tar.bz2
[rapids-conda-retry] conda returned exit code: 1
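The "start at the end and read back up" approach can be sketched in a few lines. This is only an illustration, assuming the log is available as text; the function name `last_problem_lines` and the `limit` parameter are made up for this example. The pattern uses a word boundary so that "failed" and "error:" match but "pipefail" does not, and ClobberWarning lines are skipped because they contain neither word.

```python
import re

# "fail..." or "error..." at the start of a word, in any letter case.
PROBLEM_RE = re.compile(r"\b(?:fail|error)", re.IGNORECASE)


def last_problem_lines(log_text, limit=5):
    """Return up to `limit` matches as (line_number, text), closest to the end first."""
    hits = []
    lines = log_text.splitlines()
    # Walk from the last line back toward the first (1-based line numbers).
    for number in range(len(lines), 0, -1):
        text = lines[number - 1]
        if PROBLEM_RE.search(text):
            hits.append((number, text))
            if len(hits) == limit:
                break
    return hits
```

Because scripts run with set -e stop at the first failing command, the first hit returned here is usually close to the real cause.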
Will this make it into 24.06?
> Will this make it into 24.06?
short answer
Not unless we decide that there's an urgent need for it.
long answer
The root cause of these fmt and spdlog clobbering issues across RAPIDS is "RAPIDS is carrying around patches to those libraries, so rapids-cmake always downloads them, and it places them at likely-to-cause-conflicts paths like include/fmt".
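To make the mechanism concrete: a clobber conflict is just a non-empty intersection between the sets of files that two packages install into the same environment. Here is a minimal sketch with hypothetical, abbreviated file lists (the fmt paths come from the warnings earlier in this thread). It illustrates the idea only and is not how conda implements its check.

```python
def shared_paths(files_a, files_b):
    """Return the sorted relative paths that both packages would install."""
    return sorted(set(files_a) & set(files_b))


# Hypothetical, abbreviated file lists for illustration only.
fmt_files = [
    "include/fmt/chrono.h",
    "lib/cmake/fmt/fmt-targets.cmake",
    "lib/pkgconfig/fmt.pc",
]
librmm_files = [
    "include/rmm/device_buffer.hpp",
    "include/fmt/chrono.h",  # bundled copy of fmt placed at the same path
    "lib/pkgconfig/fmt.pc",
]

conflicts = shared_paths(fmt_files, librmm_files)
for path in conflicts:
    print(f"shared path: {path}")
```

Any package that bundles fmt or spdlog under these default locations will collide with the conda-forge packages in the same way, which is why the fix has to change either what gets bundled or where it gets installed.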
I'd started pursuing a short-term fix (upgrade to newer versions of fmt and spdlog that don't need the patches), described in https://github.com/rapidsai/build-planning/issues/56 and tested over in #1544.
I stopped short of trying to roll that out across all RAPIDS conda packages, because doing so might lead to RAPIDS packages conflicting with conda itself and with other packages from conda-forge. @bdice summarized that well here: https://github.com/rapidsai/build-planning/issues/56#issuecomment-2087365946
At that point, we paused on this to work towards other packaging priorities for this release: https://github.com/rapidsai/build-planning/issues/54#issuecomment-2093814987
I'd like to pick up a more permanent solution (RAPIDS redistributing these things when necessary, via its own conda package built from rapids-cmake) in the next release cycle.
cc @mmccarty for visibility
Thanks. Moving to backlog.
This work is paused, in favor of pursuing a better long-term solution in the future. Closing this PR for now.
Subscribe to https://github.com/rapidsai/build-planning/issues/54 and https://github.com/rapidsai/build-planning/issues/56 for updates.