Georgii Evtushenko

62 comments by Georgii Evtushenko

Hello, @yboetz! In the [following PR](https://github.com/NVIDIA/cub/pull/357), I've introduced a new agent for sorting variable-length arrays in the block scope. It performs radix-based sort just like the one we use in...

If that suits your needs, we could expose this algorithm as a block-scope facility (something like `cub::BlockSort`).

@chenxuhao, I've attached a fixed version of your routine [here](https://github.com/NVIDIA/cub/issues/327#issuecomment-1250816657). We are not ready to expose it yet, though.

Hello, @chenxuhao! Does `cub::WarpMergeSort` work for you? I don't think you need an agent for that. Agents are used to share the implementation of device-scope algorithms in CUB.

My only concern regarding the `static` specifier for kernels is that we'll drastically increase binary size: ```cpp #pragma once #ifdef STATIC #define SPECIFIER static #else #define SPECIFIER #endif template SPECIFIER __global__...

I can see a lot of compilation errors when building with CTK 11.0, could you take a look? ```bash cub/cub/cmake/../../cub/device/dispatch/dispatch_scan.cuh:169:57: error: unused parameter ‘d_in’ [-Werror=unused-parameter] 169 | __global__ void DeviceScanKernel(InputIteratorT...

> Huge +1 from me. I experimented with using Catch2 in [cuCollections](https://github.com/NVIDIA/cuCollections/tree/dev/tests) and I have loved it. > > I know CUB doesn't currently use GTest, but many of...

> > Have you encountered an issue related to Catch2 usage in `.cu` files? > > All the test files in cuCollections are `.cu` files: https://github.com/NVIDIA/cuCollections/tree/dev/tests > > The is...

> For some algorithms it makes no sense to have a whole block of data in registers at once. For others a local buffer is bad due to dynamic indexing....