cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
Currently, `cub::DeviceRadixSort` only supports operating on pointers: ```C++ template static CUB_RUNTIME_FUNCTION cudaError_t SortPairs (void *d_temp_storage, size_t &temp_storage_bytes, const KeyT *d_keys_in, KeyT *d_keys_out, const ValueT *d_values_in, ValueT *d_values_out, int num_items, int...
cub.test.iterator fails at runtime with NVC++. The failure occurs because of the use of texture iterators, which use CUDA textures, a feature that NVC++ does not support. CUB texture iterator...
This is blocked on NVIDIA/libcudacxx#52 and NVIDIA/libcudacxx#55. NVIDIA/libcudacxx#52 means we can just update the callsites, and not juggle multiple types (with different APIs and semantics, depending on SM version and...
# Milestone Target Summary We currently do not test on all platforms that we support. We should test each supported major version of GCC, Clang, and MSVC. Thrust has decent...
cub::BlockHistogram is not behaving as I expected, so I boiled it down to a simple static data set. I have attached the file. - 128 data points [histogram.zip](https://github.com/NVlabs/cub/files/5003868/histogram.zip) -...
I am trying to compile the following CUB test program - ``` #include int main() { uint32_t* d_samples; uint8_t* d_histogram; uint8_t* d_levels; size_t temp_storage_bytes; int num_levels; size_t num_samples; cub::DeviceHistogram::HistogramRange(nullptr, temp_storage_bytes,...
https://github.com/NVlabs/cub/blob/c3cceac115c072fb63df1836ff46d8c60d9eb304/cub/block/block_radix_rank.cuh#L353 This documentation is quite unclear: `///< [out] For each key, the local rank within the tile` E.g. does it mean the number of previous keys with the same value...
https://github.com/NVIDIA/cub/blob/571aab900cc1d9741d93013ceaffe38d7e6e3b50/cmake/CubBuildCompilerTargets.cmake#L115 Hello folks, using CUDA 10.1.* the compilation fails because the compiler does not recognize `--promote_warnings` as a valid compiler flag. I believe something like the following should fix it....
The C++ assignment operator presumes the left-hand side is in a valid state, but for many places in CUB code, the left-hand side is uninitialized memory. This causes failures for...
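To illustrate the distinction this issue draws, here is a minimal host-side sketch in plain C++ (not CUB code). It assumes a hypothetical type `Label` whose assignment operator is non-trivial because it holds a `std::string`. The names `Label` and `copy_into_raw_storage` are made up for illustration. When the destination is raw, uninitialized storage, the sketch copy-constructs the object in place with placement new rather than assigning to it, since assignment would read the destination's (nonexistent) prior state.

```cpp
#include <cassert>
#include <new>      // placement new
#include <string>

// Hypothetical type: its implicit copy assignment calls
// std::string::operator=, which requires the left-hand side to be a
// constructed std::string already.
struct Label {
    std::string text;
};

// Copies `src` into raw, uninitialized storage and returns the stored text.
// The destination bytes have never held a Label, so the copy is made by
// construction, not by assignment.
inline std::string copy_into_raw_storage(const Label& src) {
    alignas(Label) unsigned char storage[sizeof(Label)];

    // Wrong for raw storage (undefined behavior, shown only as a comment):
    //   *reinterpret_cast<Label*>(storage) = src;

    // Copy-construct in place instead.
    Label* dst = ::new (static_cast<void*>(storage)) Label(src);

    std::string result = dst->text;

    // Objects created with placement new must be destroyed explicitly.
    dst->~Label();
    return result;
}
```

Calling `copy_into_raw_storage(Label{"abc"})` returns a copy of the string held by the in-place constructed object. Whether the same approach fits each affected place in CUB depends on the surrounding device code, which this sketch does not model.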