Allison Piper
The sections on versioning and related topics toward the bottom are particularly outdated.
There are a number of tests that we currently aren't building because they are `ifdef`'d out -- @senior-zero discovered a few uncovered cases in `test_block_scan`, and @canonizer just reported another...
Per #503, it should be possible to introduce in-place versions of these algorithms.
This is blocked on NVIDIA/libcudacxx#52 and NVIDIA/libcudacxx#55. NVIDIA/libcudacxx#52 means we can just update the callsites, and not juggle multiple types (with different APIs and semantics, depending on SM version and...
# Milestone Target Summary

We currently do not test on all platforms that we support. We should test each supported major version of GCC, Clang, and MSVC. Thrust has decent...
An internal user has reported a bug in `cub::DeviceHistogram`. When using 16-bit values, the computed temporary storage buffer size is too small on Pascal, leading to runtime errors. They've applied...
See discussion in #294 and #305. The same change should be applied to `cub::DeviceReduce::Reduce`.
I have some slides for this; it's just a matter of embedding them into the docs and doing a writeup.

![NVBench Overview - Cold Measurement](https://user-images.githubusercontent.com/58744/115718961-a3bdd800-a349-11eb-8a50-974eaf7df3a8.png)
![NVBench Overview - Hot Measurement](https://user-images.githubusercontent.com/58744/115718966-a4566e80-a349-11eb-95ea-e36e0e6243f9.png)
Several users have asked about how to add custom arguments to their benchmarks (e.g. #86). This is done by implementing an application specific `main(argc, argv)` function and linking to the...
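As a rough illustration of the custom-`main` approach, the sketch below separates one application-specific flag from the arguments that would be forwarded to the benchmark library. It is plain C++ and does not call any NVBench API. The `custom_options` struct, the `split_custom_args` helper, and the `--input-file` flag are all hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical application-specific options; not part of NVBench.
struct custom_options
{
  std::string input_file; // value of the hypothetical `--input-file <path>` flag
};

// Returns the arguments that are NOT consumed by the application, in their
// original order, and stores the value of `--input-file` in `opts`.
// A trailing `--input-file` with no value is left in the forwarded list.
std::vector<char*> split_custom_args(int argc, char** argv, custom_options& opts)
{
  assert(argc >= 1); // argv[0] is expected to hold the program name

  std::vector<char*> forwarded;
  forwarded.push_back(argv[0]);

  for (int i = 1; i < argc; ++i)
  {
    if (std::strcmp(argv[i], "--input-file") == 0 && i + 1 < argc)
    {
      opts.input_file = argv[++i]; // consume the flag and its value
    }
    else
    {
      forwarded.push_back(argv[i]); // leave everything else untouched
    }
  }
  return forwarded;
}
```

An application-specific `main(argc, argv)` could call this helper first, then hand `forwarded.size()` and `forwarded.data()` to whatever argument parsing the benchmark library provides, so that the library never sees the custom flag.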