Georgii Evtushenko

Results: 62 comments by Georgii Evtushenko

Hello @upsj! A few ideas on this: - You could add `BlockLoadAlgorithm` as a template parameter to `AgentReducePolicy` - I wouldn't advise using `LoadDirectBlocked` in this context. Current...

> The performance results are interesting with the mix of better and worse. My interpretation of that is that some of the functions are not as well tuned as they...

Testing revealed some issues with this approach. We can't simply remove `__launch_bounds__`; here's a reproducer that shows why: ```cpp #include #include using MaxPolicyT = cub::DispatchRadixSort::MaxPolicy;...

@jrhemstad I agree with your point, thanks! I'll probably try to clamp the thread block size.

Hello @AKKamath! Thank you for taking the time to report this. [It's not the only place](https://github.com/NVIDIA/cub/issues/345) in CUB where cache operator semantics are misused. We have plans to clean up this...

Created a separate [issue](https://github.com/NVIDIA/cub/issues/560) to track migration to libcu++ atomics. Closing this one. @AKKamath if there are any action items that I missed, please feel free to reopen.

Hello, @lkskstlr! Thank you for your feedback. Unfortunately, your code snippet is insufficient to reproduce this error. I've extracted the CUB-related parts into the following code: ```cuda #include #include #include...

Hello, @RaulPPelaez, @YinLiu-91! Thank you for noting this. Unfortunately, this behaviour is expected but poorly documented. First of all, if we partition a hardware warp into tiles of any non-power-of-two...

Hello, @michaelmigliore! We have been postponing the `cub::DeviceSpmv` deprecation for a while. At the moment we are going to [deprecate](https://github.com/NVIDIA/cccl/issues/896) it in CUB 2.1. Therefore, I don't anticipate any efforts on this...

We might consider a generalized version of this API. The original issue looks like this. ![image](https://user-images.githubusercontent.com/9890394/121337875-a68a8180-c925-11eb-86d1-85d9767ea2ee.png) It's helpful to have a mapping for ranges within sources and destinations. In this...