cuda-api-wrappers
Thin, unified, C++-flavored wrappers for the CUDA APIs
In CUDA 10 (?), API calls were introduced to capture activity on a stream: ``` __host__ cudaError_t cudaStreamBeginCapture ( cudaStream_t stream, cudaStreamCaptureMode mode ); __host__ cudaError_t cudaStreamEndCapture ( cudaStream_t stream,...
Both the runtime and the driver API allow allocation of mipmapped arrays. Runtime: `cudaMallocMipmappedArray()`, `cudaFreeMipmappedArray()`, `cudaGetMipmappedArrayLevel()`. Driver: `cuMipmappedArrayCreate()`, `cuMipmappedArrayDestroy()`, `cuMipmappedArrayGetLevel()`. We should add support...
While we support "raw" allocation function calls and allocators for `unique_ptr`s, we don't support the allocators that standard library containers might use. This should perhaps be added. An inspiration is...
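To make the idea concrete, here is a minimal sketch of what a standard-library-compatible allocator could look like. It is not this library's API: `raw_allocate()`, `raw_free()`, `raw_allocator` and `sum_first()` are hypothetical names, and the two raw functions forward to the host heap so the sketch runs without a GPU. Note that a container such as `std::vector` constructs its elements on the host, so an allocator like this only makes sense for memory the host can dereference (for example pinned or managed memory).

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical stand-ins for "raw" allocation calls. They forward to the host
// heap here; a real version would call whatever allocation function is wrapped.
inline void* raw_allocate(std::size_t num_bytes) { return std::malloc(num_bytes); }
inline void raw_free(void* ptr) { std::free(ptr); }

// Minimal allocator (hypothetical) satisfying the C++11 Allocator requirements.
template <typename T>
struct raw_allocator {
    using value_type = T;

    raw_allocator() noexcept = default;

    // Converting constructor, needed so containers can rebind the allocator.
    template <typename U>
    raw_allocator(const raw_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // Guard against overflow in n * sizeof(T).
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) { throw std::bad_alloc(); }
        void* p = raw_allocate(n * sizeof(T));
        if (p == nullptr && n != 0) { throw std::bad_alloc(); }
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { raw_free(p); }
};

// The allocator is stateless, so all instances compare equal.
template <typename T, typename U>
bool operator==(const raw_allocator<T>&, const raw_allocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const raw_allocator<T>&, const raw_allocator<U>&) noexcept { return false; }

// Usage: a std::vector whose storage comes from the raw allocation calls.
inline int sum_first(int n) {
    std::vector<int, raw_allocator<int>> v;
    for (int i = 0; i < n; ++i) { v.push_back(i); }
    int total = 0;
    for (int x : v) { total += x; }
    return total;
}
```

Keeping the allocator stateless is the simplest choice; an allocator bound to a specific device or context would need to carry that state and compare equal only when the state matches.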
CUDA offers a library named CUPTI - the [CUDA Profiling Tools Interface](https://developer.nvidia.com/CUPTI-CTK10_2): > CUPTI provides a set of APIs targeted at ISVs creating profilers and other performance optimization tools: >...
We currently don't have API wrappers for obtaining shared memory region attributes.
Should `cuda::device::count()` return an `std::size_t` or a `device::id_t`? `std::size_t` is semantically more accurate, i.e. it's the size of a set of devices. It's non-negative. And in C++ we use this...
We currently have more than a few wrappers taking a `void*` and a `num_bytes` or `size_in_bytes` etc. Instead, why not use [spans](https://stackoverflow.com/questions/45723819/what-is-a-span-and-when-should-i-use-one?noredirect=1&lq=1)? And if we do that, perhaps we could/should...
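As a rough illustration of the difference, the sketch below contrasts a pointer-plus-size signature with a span-based one. Everything in it is hypothetical and host-only: `mem_span` is a minimal stand-in for a span type (not `std::span` and not a type from this library), and `copy_raw()` and `copy_elements()` are invented names that use `std::memcpy` where a real wrapper would issue a CUDA copy.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>

// Hypothetical minimal span: a non-owning (pointer, element count) pair.
template <typename T>
struct mem_span {
    T* data;
    std::size_t size;  // number of elements, not bytes

    std::size_t size_bytes() const noexcept { return size * sizeof(T); }
};

// Pointer-plus-size style: the caller must compute the byte count correctly,
// and nothing ties the size to either pointer.
inline void copy_raw(void* dst, const void* src, std::size_t num_bytes) {
    std::memcpy(dst, src, num_bytes);
}

// Span style: the element type and the extents travel with the pointers,
// so the wrapper can check sizes and compute the byte count itself.
template <typename T>
void copy_elements(mem_span<T> dst, mem_span<const T> src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "a bytewise copy requires a trivially copyable type");
    if (dst.size < src.size) { throw std::length_error("destination span too small"); }
    copy_raw(dst.data, src.data, src.size_bytes());
}
```

A call then looks like `copy_elements(mem_span<int>{dst, 4}, mem_span<const int>{src, 3})`, and a destination that is too small is reported instead of being silently overrun.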
By mistake, the `examples/` directory has a committed skeleton of a test program for the IPC CUDA features via our IPC wrappers. The program currently does nothing, but has a...
The example programs double as sort-of unit tests, as well as simpler compilation tests, since most code is not instantiated from templates when you just build the...
Almost all of the code is missing Doxygen comments, most importantly on the classes and their methods. Let's write them.