cudf
Add patch for incorrect cuco noexcept clauses
Description
cuco previously marked a number of methods as noexcept that can in fact throw exceptions. This causes problems for cudf functions that call these methods. The issue was fixed in cuco upstream, but we cannot easily update to the latest commit of cuco, especially in a patch fix for 24.06. This PR instead adds a rapids-cmake patch for the cuco clone to address this issue. The patch may be removed once we update to a commit of cuco that contains the necessary fix.
Resolves #16059
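For reviewers less familiar with the failure mode, here is a minimal sketch of why an incorrect `noexcept` matters. None of this is cuco or cudf code: `allocate_or_throw`, `insert_marked_noexcept`, `insert_may_throw`, `try_insert`, and `demo` are hypothetical names used only for illustration. An exception that tries to leave a `noexcept` function results in `std::terminate`, so the caller never gets a chance to handle it.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical stand-in for an operation that can fail under memory pressure.
inline void allocate_or_throw(bool fail) {
  if (fail) { throw std::bad_alloc{}; }
}

// Incorrectly marked noexcept: if allocate_or_throw throws, the exception
// cannot propagate out of this function and std::terminate is called.
inline void insert_marked_noexcept(bool fail) noexcept { allocate_or_throw(fail); }

// Without noexcept, the exception propagates to the caller as usual.
inline void insert_may_throw(bool fail) { allocate_or_throw(fail); }

// The noexcept operator reports what each declaration promises at compile time.
static_assert(noexcept(insert_marked_noexcept(true)), "declared as non-throwing");
static_assert(!noexcept(insert_may_throw(true)), "declared as potentially throwing");

// A caller can only fail gracefully when the callee is allowed to throw.
inline std::string try_insert(bool fail) {
  try {
    insert_may_throw(fail);
    return "ok";
  } catch (std::bad_alloc const&) {
    return "out of memory";
  }
}

// Usage: both outcomes are observable without aborting the process.
inline void demo() {
  assert(try_insert(false) == "ok");
  assert(try_insert(true) == "out of memory");
}
```

Calling `insert_marked_noexcept(true)` inside the same `try` block would terminate the process before the `catch` clause runs, which matches the abort described in the linked issue.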
Checklist
- [x] I am familiar with the Contributing Guidelines.
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
I have confirmed in a local build that the sample in #16059 throws a std::bad_alloc and then aborts without this PR, while with these changes it fails gracefully. Note that if you use that sample directly as a gtest, the tests cannot be run with ctest because that seems to mask the abort; the test executable must be run directly. We'll probably want to put some thought into how to test this kind of issue more systematically. I didn't want to include a crude test that could OOM other workers in CI, so I haven't included a test in this PR. That's something we will want to discuss further for 24.08.
@vyasr if a hotfix is coming for 24.06, would it be possible to also include #16038? That fixed a functional regression, and we ended up needing to create our own patched libcudf build to pull it in as well.
@kkraus14 Yes, we’re tracking these hotfix candidates for a 24.06.01 release:
- cuCo noexcept backport patch https://github.com/rapidsai/cudf/pull/16077
- Decimal/float AST cast fix https://github.com/rapidsai/cudf/pull/16101 (original 24.08 fix in #16038)
- Decimal/float AST cast test https://github.com/rapidsai/cudf/pull/16102 (original 24.08 fix in #16045)
- Conditional join segfault https://github.com/rapidsai/cudf/pull/16100 (original 24.08 fix in https://github.com/rapidsai/cudf/pull/16094)
- Conditional join illegal memory access https://github.com/rapidsai/cudf/pull/16133 (original 24.08 fix in #16127)
Thanks to your team for communicating these issues as you find them. The reproducible tests help a lot with estimating severity and identifying solutions.
The 24.06.01 hotfix is released with all of the PRs listed here!
Conda packages and wheels are available now. Docker images are mirroring and will be updated in ~2 hours.