cuda-api-wrappers
Thin, unified, C++-flavored wrappers for the CUDA APIs
We currently use a builder-ish copy parameters class, which inherits from the actual raw copy parameters. However, it involves a lot of code duplication, use of conditional expressions etc. Perhaps...
It would be nicer to let our users build arrays not just with descriptors they provide, but in a proper builder pattern. Now, we would want a builder to also...
`cuCtxGetSharedMemConfig()` is now deprecated, likely because devices supporting it are no longer officially supported in CUDA. So, let's hide all parts of the code involving this function when building with...
CUDA 12.3 added support for specifying properties for graph nodes - when...
* Adding edges between existing nodes
* Removing edges between existing nodes
* Adding a new node
*...
Trouble building with MSVC was reported in #664, and the cause seems to be a gratuitous assignment operator of the "poor man's optional": `poor_mans_optional &operator=(const T &&value) noexcept(::std::is_nothrow_move_assignable<T>::value)` I'm...
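To illustrate why a `const T &&` parameter tends to be unnecessary, here is a minimal sketch of an optional-like class with only the two usual assignment overloads. The class name `tiny_optional` and its simplified storage (which assumes `T` is default-constructible) are hypothetical and are not the library's `poor_mans_optional`. A const rvalue cannot be moved from, so it binds to the `const T &` overload and is copied anyway.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for illustration only; not the library's class.
// Simplification: the value is stored directly, so T must be default-constructible.
template <typename T>
class tiny_optional {
public:
    tiny_optional() noexcept(std::is_nothrow_default_constructible<T>::value)
        : value_(), has_value_(false) {}

    // Copy-assign from an lvalue (also the overload chosen for const rvalues).
    tiny_optional& operator=(const T& value)
        noexcept(std::is_nothrow_copy_assignable<T>::value)
    {
        value_ = value;
        has_value_ = true;
        return *this;
    }

    // Move-assign from a non-const rvalue.
    tiny_optional& operator=(T&& value)
        noexcept(std::is_nothrow_move_assignable<T>::value)
    {
        value_ = std::move(value);
        has_value_ = true;
        return *this;
    }

    bool has_value() const noexcept { return has_value_; }
    const T& value() const { return value_; }

private:
    T value_;
    bool has_value_;
};
```

With these two overloads, `opt = some_lvalue;` copies, `opt = T{...};` moves, and `opt = std::move(some_const_object);` falls back to the copying overload, so no third overload is needed in this sketch.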
I've asked for a code review of my unique_span class, on codereview.stackexchange.com, and [got one](https://codereview.stackexchange.com/a/293197/64513). Issues brought up:
1. Can simplify `swap()`
2. No need to delete copy ctors other...
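As a rough illustration of the first point, the sketch below shows a member-wise `swap()` on a simplified owning span. The class name `owning_span` and all of its members are hypothetical; this is not the library's `unique_span`, and it assumes C++14 for `std::exchange`. As a general C++ rule, a class that declares a move constructor or move assignment operator has its implicit copy operations defined as deleted.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical, simplified owning span; for illustration only.
template <typename T>
class owning_span {
public:
    owning_span() noexcept = default;

    // Allocates n value-initialized elements (nothing is allocated for n == 0).
    explicit owning_span(std::size_t n)
        : data_(n ? new T[n]() : nullptr), size_(n) {}

    // Move construction steals the buffer and leaves the source empty.
    owning_span(owning_span&& other) noexcept
        : data_(std::exchange(other.data_, nullptr)),
          size_(std::exchange(other.size_, 0)) {}

    // Move assignment via a temporary plus swap; the temporary's destructor
    // releases the buffer this object previously owned.
    owning_span& operator=(owning_span&& other) noexcept {
        owning_span tmp(std::move(other));
        swap(tmp);
        return *this;
    }

    ~owning_span() { delete[] data_; }

    // Member-wise swap: exchanging the two members is all that is needed.
    void swap(owning_span& other) noexcept {
        std::swap(data_, other.data_);
        std::swap(size_, other.size_);
    }

    T* data() const noexcept { return data_; }
    std::size_t size() const noexcept { return size_; }
    T& operator[](std::size_t i) const { return data_[i]; }

private:
    T* data_ = nullptr;
    std::size_t size_ = 0;
};
```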
NVIDIA offers an API for DMA access directly from the GPU, with async work scheduled via CUDA streams: https://docs.nvidia.com/gpudirect-storage/api-reference-guide/index.html It seems that this could integrate nicely with the APIs we...
Let's allow beginning and ending stream capture via stand-alone functions as well as the `begin_capture()` and `end_capture()` methods.
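One possible shape for this, sketched with a mock stream type so that it compiles without CUDA: the stand-alone functions simply forward to the methods. Everything here except the method names `begin_capture()` and `end_capture()` is an assumption made for illustration, including the `sketch` and `capture` namespaces, the `captured_graph` return type, and the dummy work counter.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical placeholder for whatever a finished capture yields.
struct captured_graph { int node_count; };

// Hypothetical stand-in for a stream wrapper: it only tracks state and
// makes no CUDA calls.
class stream_t {
public:
    void begin_capture() { capturing_ = true; enqueued_ = 0; }

    // Pretend to enqueue work; it is only counted while capturing.
    void enqueue_dummy_work() { if (capturing_) { ++enqueued_; } }

    captured_graph end_capture() {
        capturing_ = false;
        return captured_graph{enqueued_};
    }

    bool is_capturing() const noexcept { return capturing_; }

private:
    bool capturing_ = false;
    int enqueued_ = 0;
};

namespace capture {

// Stand-alone equivalents which forward to the methods.
inline void begin(stream_t& stream) { stream.begin_capture(); }
inline captured_graph end(stream_t& stream) { return stream.end_capture(); }

} // namespace capture
} // namespace sketch
```

Forwarding keeps a single implementation in the methods, so the two spellings cannot drift apart.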
We currently make a bunch of assumptions regarding texture views, and let the user handle them the "raw" CUDA way. It would be nice if, instead, we could offer a...
Let's allow `context.launch()` to do synchronous/default-stream launching of kernels, on any context object.