Allison Piper
It would be convenient to make note of the release dates for each version in the table of releases in the README.md and the entries in CHANGELOG.md, both for Thrust...
PR NVIDIA/cub#218 fixes this in CUB's radix sort. We should:
- [ ] Check whether Thrust's other backends handle this case correctly.
- [ ] Provide a guarantee of this in...
After #1184 is merged.
Symlinks cannot be used in a cross-platform project. I'd like to remove the `cub -> dependencies/cub` symlink and have `dependencies/cub` become the one and only location of CUB inside of...
Thrust and CUB would like to use libcu++. However, these projects must support the nvc++ compiler, so they are blocked from using libcu++ features until it is usable with nvc++.
The `thread_scope` enum is [gated behind the `atomic` header](https://github.com/NVIDIA/libcudacxx/blob/bda0c48d46ff7d0e3d9dea3240426efe56db6bc7/include/cuda/std/detail/__atomic#L65-L70). The `atomic` header [emits errors when used with certain SM versions](https://github.com/NVIDIA/libcudacxx/blob/bda0c48d46ff7d0e3d9dea3240426efe56db6bc7/include/cuda/std/detail/__atomic#L9-L11). `thread_scope` is useful outside of atomics for general scope labeling....
nvcc defaults to rdc-off, whereas nvc++ defaults to rdc-on. We need to explicitly enable or disable RDC for each CUDA target, rather than only enabling it when needed.
# Summary
The user-friendly `cub::Device*` entry points into the CUB device algorithms assume that the problem size can be indexed with a 32-bit int. As evidenced by a slew of...
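To make the 32-bit limit concrete, here is a minimal sketch of a guard a caller could run before handing a problem size to an entry point that indexes with a 32-bit `int`. Both helpers (`fits_in_int32`, `num_chunks`) are hypothetical names invented for illustration and are not part of CUB. Splitting the input into chunks is only one possible workaround, assumed here rather than taken from the issue.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not a CUB API): true when num_items can be
// represented by a signed 32-bit index without overflow.
inline bool fits_in_int32(std::uint64_t num_items)
{
  return num_items <=
         static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}

// Hypothetical helper (not a CUB API): number of passes needed to cover
// num_items when each pass handles at most max_chunk items.
inline std::uint64_t num_chunks(std::uint64_t num_items, std::uint64_t max_chunk)
{
  assert(max_chunk > 0);
  // Divide first, then round up, so the computation itself cannot overflow.
  return num_items / max_chunk + (num_items % max_chunk != 0 ? 1 : 0);
}
```

For example, five billion items do not fit in a 32-bit index, and covering them with chunks of at most `INT32_MAX` items takes three passes.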
# Overview
Some `cub::Device*` algorithms are/were documented to be run-to-run deterministic, but the implementations no longer fulfill that guarantee. This has been a major pain point for several users who...
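One common source of run-to-run differences in parallel reductions is that floating-point addition is not associative, so the result depends on the order in which partial results are combined. The excerpt above does not state that this is the cause in CUB, so treat the following as an assumption for illustration only. The function names are hypothetical.

```cpp
// Sum three values grouped from the left: (a + b) + c.
inline double sum_left(double a, double b, double c)
{
  return (a + b) + c;
}

// Sum the same three values grouped from the right: a + (b + c).
inline double sum_right(double a, double b, double c)
{
  return a + (b + c);
}

// With IEEE 754 doubles and default (non-fast-math) compilation:
//   sum_left(1e16, -1e16, 1.0)  yields 1.0, because the large terms cancel first.
//   sum_right(1e16, -1e16, 1.0) yields 0.0, because 1.0 is lost when added to -1e16.
```

If the grouping of partial sums varies between runs, for instance because it depends on scheduling, the final value can vary as well.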