cuda-api-wrappers

Thin, unified, C++-flavored wrappers for the CUDA APIs

Results 151 cuda-api-wrappers issues

We currently use a builder-ish copy parameters class, which inherits the actual raw copy parameters. However, it involves a lot of code duplication, use of conditional expressions etc. Perhaps...

question

It would be nicer to let our users build arrays not just with descriptors they provide, but in a proper builder pattern. Now, we would want a builder to also...

enhancement
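To make the builder idea above concrete, here is a minimal host-only sketch. Every name in it (`sketch::array_builder`, `array_descriptor`, `dimensions()`, `element_type()`, `build()`) is hypothetical and chosen for illustration; none of it is the library's actual API, and no CUDA call is made.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

namespace sketch {

// Hypothetical plain descriptor, standing in for whatever a user fills in by hand today.
struct array_descriptor {
    std::size_t width = 0;
    std::size_t height = 0;
    std::size_t depth = 0;        // 0 means "not used" in this sketch
    std::size_t element_size = 0; // in bytes
    int num_dimensions = 0;
};

// Hypothetical builder: collects settings step by step, validates on build().
class array_builder {
public:
    array_builder& dimensions(std::size_t width, std::size_t height)
    {
        desc_.width = width;
        desc_.height = height;
        desc_.depth = 0;
        desc_.num_dimensions = 2;
        return *this;
    }

    array_builder& dimensions(std::size_t width, std::size_t height, std::size_t depth)
    {
        desc_.width = width;
        desc_.height = height;
        desc_.depth = depth;
        desc_.num_dimensions = 3;
        return *this;
    }

    template <typename T>
    array_builder& element_type()
    {
        desc_.element_size = sizeof(T);
        return *this;
    }

    // Refuses to produce a descriptor until all required settings were given.
    array_descriptor build() const
    {
        if (desc_.num_dimensions == 0) {
            throw std::logic_error("array dimensions were not set");
        }
        if (desc_.element_size == 0) {
            throw std::logic_error("array element type was not set");
        }
        return desc_;
    }

private:
    array_descriptor desc_;
};

} // namespace sketch
```

A call chain would then read `sketch::array_builder{}.dimensions(4, 8).element_type<float>().build()`. In a real wrapper, `build()` would presumably create the array itself rather than return a descriptor.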

`cuCtxGetSharedMemConfig()` is now deprecated, likely because devices supporting it are no longer officially supported in CUDA. So, let's hide all parts of the code involving this function when building with...

task
resolved-on-development

CUDA 12.3 added support for specifying properties for graph nodes - when... * Adding edges between existing nodes * Removing edges between existing nodes * Adding a new node *...

missing-cuda-feature

Trouble building with MSVC was reported in #664, and the cause seems to be a gratuitous constructor of "poor man's optional": ``` poor_mans_optional &operator=(const T &&value) noexcept(::std::is_nothrow_move_assignable<T>::value) ``` I'm...

resolved-on-development
ms-windows
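For context on the signature quoted above: a `const T&&` parameter cannot be moved from, so an optional-like type usually needs only a `const T&` overload and a `T&&` overload. The sketch below uses a hypothetical `tiny_optional` class (not the library's `poor_mans_optional`) and assumes `T` is default-constructible. It does not claim to reproduce the MSVC failure.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical, simplified stand-in for an optional-like type.
template <typename T>
class tiny_optional {
public:
    tiny_optional() : has_value_(false), value_() {}

    // Copy-assign from an lvalue; const rvalues also bind here.
    tiny_optional& operator=(const T& value)
        noexcept(std::is_nothrow_copy_assignable<T>::value)
    {
        value_ = value;
        has_value_ = true;
        return *this;
    }

    // Move-assign from a non-const rvalue.
    tiny_optional& operator=(T&& value)
        noexcept(std::is_nothrow_move_assignable<T>::value)
    {
        value_ = std::move(value);
        has_value_ = true;
        return *this;
    }

    bool has_value() const noexcept { return has_value_; }
    const T& value() const { return value_; }

private:
    bool has_value_;
    T value_;
};

// Exercises both overloads and returns the last value stored.
inline std::string demo_assign()
{
    tiny_optional<std::string> opt;
    std::string s = "lvalue";
    opt = s;                      // selects operator=(const T&)
    opt = std::string("rvalue");  // selects operator=(T&&)
    return opt.value();
}
```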

I've asked for a code review of my unique_span class on codereview.stackexchange.com, and [got one](https://codereview.stackexchange.com/a/293197/64513). Issues brought up: 1. Can simplify `swap()` 2. No need to delete copy ctors other...

task
resolved-on-development

NVIDIA offers an API for DMA access directly from the GPU, with async work scheduled via CUDA streams: https://docs.nvidia.com/gpudirect-storage/api-reference-guide/index.html It seems that this could integrate nicely with the APIs we...

missing-cuda-feature

Let's allow beginning and ending stream capture via stand-alone functions as well as the `begin_capture()` and `end_capture()` methods.

task
resolved-on-development
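One way to picture the stand-alone variant is a pair of free functions that simply forward to the methods. The sketch below is host-only and uses a hypothetical `sketch::stream_t`. Only the method names `begin_capture()` and `end_capture()` come from the issue text. The `capture::begin` and `capture::end` names, and the integer return value standing in for a captured graph, are assumptions.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical stand-in for a stream wrapper; it makes no CUDA calls.
class stream_t {
public:
    void begin_capture()
    {
        assert(!capturing_);
        capturing_ = true;
    }

    // Returns a running count of completed captures in place of a real graph object.
    int end_capture()
    {
        assert(capturing_);
        capturing_ = false;
        return ++completed_;
    }

    bool is_capturing() const noexcept { return capturing_; }

private:
    bool capturing_ = false;
    int completed_ = 0;
};

namespace capture {

// Free-function equivalents that forward to the methods above.
inline void begin(stream_t& stream) { stream.begin_capture(); }
inline int end(stream_t& stream) { return stream.end_capture(); }

} // namespace capture

} // namespace sketch
```

With this shape, `sketch::capture::begin(s)` and `s.begin_capture()` are interchangeable, so the method stays the single point of implementation.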

We currently make a bunch of assumptions regarding texture views, and let the user handle them the "raw" CUDA way. It would be nice if, instead, we could offer a...

enhancement

Let's allow `context.launch()` to do synchronous/default-stream launching of kernels, on any context object.

task
resolved-on-development