cudf
cudf copied to clipboard
adding wheel build for libcudf
Description
This is meant to add a C++ shared library wheel as an alternative to the statically linked self-contained wheel. It is part of https://github.com/rapidsai/build-planning/issues/33
Checklist
- [x] I am familiar with the Contributing Guidelines.
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
Stuck on dlpack here. The failure is:
/__w/cudf/cudf/python/cudf/build/cp310-cp310-manylinux_2_28_aarch64/cudf/_lib/interop.cxx:1443:10: fatal error: dlpack/dlpack.h: No such file or directory
1443 | #include "dlpack/dlpack.h"
| ^~~~~~~~~~~~~~~~~
compilation terminated.
https://github.com/rapidsai/cudf/actions/runs/8662725105/job/23755706695?pr=15483#step:9:290
@vyasr the build is passing now, but the tests seem to show that the library (libcudf.so) can't be found. It does seem like libcudf-cuXY is being installed OK, so I'm thinking it has to be some kind of RPATH issue. I don't see how you've addressed this in the RMM PR. Any advice?
I've inspected one of the libcudf wheels, and it has both lib and lib64 folders. The lib folder has nvcomp stuff, and lib64 has libcudf.so.
You won't see this in the rmm PR because rmm is header-only, so there is no library there. The raft PR is a more useful example here. What you want to look at is actually the Python components of the libraft library, specifically the load.py
module. The idea is that we never set RPATHs on libraries explicitly. Instead, we use ctypes to dlopen the library before we import any of the extension modules. This is a more robust solution than setting RPATHS because of the various ways in which it is possible for Python packages to coexist on a user's system (different environments, PYTHONPATH
modification, etc).
You have to set the variables before the rapids-cmake clone happens, which in this case is in the include(../../rapids_config.cmake)
line. IOW the variables as currently set won't have any effect. Also, I think you'll need to specify the repo as vyasr/rapids-cmake
, not just vyasr
.
This is annoying: we've fixed directory, but there are now failures in the Python build trying to find nvcomp. Explanation is here.
It looks like my last nvcomp patch worked! The Python wheels built as well.
This pull request requires additional validation before any workflows can run on NVIDIA's runners.
Pull request vetters can view their responsibilities here.
Contributors can view more details about this message here.
/ok to test
There's something wrong with RBB and pyarrow here. Here's the traceback: https://github.com/rapidsai/cudf/actions/runs/9653861106/job/26627010517#step:9:214
pyproject.toml has:
tool.rapids-build-backend]
requires = [
"cmake>=3.26.4",
"cython>=3.0.3",
"ninja",
"numpy==1.23.*",
"pyarrow==16.1.0.*",
"rmm==24.8.*,>=0.0.0a0",
] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
which I think should be the right thing here. Maybe that's not the pyproject.toml that exists at the time of build, though. Is there any way to have RBB act in "debug mode" where it outputs the pyproject.toml that it has generated? @KyleFromNVIDIA
Is there any way to have RBB act in "debug mode" where it outputs the pyproject.toml that it has generated?
None currently. What are you thinking this would look like? Writing the file to stdout? Copying it to an additional location? Leaving it as it wrote it after build is complete and not restoring it to its original state?
Writing the file to stdout was my initial idea, but I'm open to whatever you think is best and/or easiest to do.
Lots of problems with codecov stuff, which seem to be happening on other repos. There was one possibly concerning test failure:
FAILED python/cudf/cudf_pandas_tests/test_cudf_pandas.py::test_rmm_option_on_import[pool] - AssertionError: assert -8 == 0
+ where -8 = CompletedProcess(args=['python', '-m', 'cudf.pandas', '/__w/cudf/cudf/python/cudf/cudf_pandas_tests/data/profile_basic.py'], returncode=-8, stdout='', stderr='').returncode
FAILED python/cudf/cudf_pandas_tests/test_cudf_pandas.py::test_rmm_option_on_import[managed_pool] - AssertionError: assert -8 == 0
+ where -8 = CompletedProcess(args=['python', '-m', 'cudf.pandas', '/__w/cudf/cudf/python/cudf/cudf_pandas_tests/data/profile_basic.py'], returncode=-8, stdout='', stderr='').returncode
Could you update the PR description now?
Please open a PR at https://github.com/rapidsai/rapids-metadata to reflect the new project structure. This can be the same PR as the one that updates cugraph.
@KyleFromNVIDIA https://github.com/rapidsai/rapids-metadata/pull/18
@vyasr the description is up-to-date. Thanks for the reminder.
I'm picking this up, moving it back to "draft" for now until I get the tests passing, then will put it back up for review.
I've retargeted this at branch-24.10
and updated all the version references.
JFYI I'm expecting #16299 to get done before this PR, which means there will probably need to be some associated updates made here. On the off chance that the situation ends up reversed, we can work on updating that PR accordingly.
@jameslamb OK #16299 is merged now so you've got a bunch of conflicts here. Let me know if you have any issues in resolving them!
Thanks @vyasr .
If you have time to be a quick ci-codeowners
reviewer for me, I'd appreciate a review on #16575. If that's merged, it'll also have some conflicts with this PR. No worries if you don't have time, not a big set of changes either way.
Sure thing, looking now.
Alright, I think this is ready for review!
I've updated the PR description with some relevant details.
I've gone through the prior review threads and resolved any that are now resolved. Anything still open is something where I'm specifically looking for reviewer feedback (except https://github.com/rapidsai/cudf/pull/15483#discussion_r1723657818, which is some pre-commit hook behavior I need to investigate).
I understand devcontainers
changes will be necessary as well... I'll get those staged in a PR into https://github.com/rapidsai/devcontainers.
I've gone through and responded to the many threads where there was open discussion that needed some feedback from me, but I haven't reviewed yet. Will do that soon.
devcontainers job passed after merging https://github.com/rapidsai/devcontainers/pull/271,
https://github.com/rapidsai/cudf/actions/runs/10515765176/job/29173481700?pr=15483
😁 thanks @AyodeAwe
Given that + all the approvals here, I'm gonna merge this. Thanks so much for all the help everyone!!!
/merge