cuda-api-wrappers
Thin, unified, C++-flavored wrappers for the CUDA APIs
Let's not force an extra copy of the log; instead, allow the user to pass in a reference to a buffer somehow.
CUDA 11.2 introduced a "memory pool" mechanism; we should support it. Full API documentation is [here](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY__POOLS.html). ```lang-cpp cudaError_t cudaMallocFromPoolAsync ( void** ptr, size_t size, cudaMemPool_t memPool, cudaStream_t stream ) cudaError_t cudaMemPoolCreate...
CUDA 11.2 added asynchronous memory allocation and de-allocation. Let's support that. API description: [here](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY__POOLS.html). ``` cudaError_t cudaFreeAsync ( void* devPtr, cudaStream_t hStream ); cudaError_t cudaMallocAsync ( void** devPtr, size_t size,...
NVIDIA's [jitify](https://github.com/NVIDIA/jitify) library provides a C++-ish interface to (some of? all of?) the run-time compilation / JIT compilation facilities NVIDIA provides. This library should provide this functionality, in particular; and...
When we render compilation options into a string, it ends with an extra space. To fix this, we'll probably need to write startout instead of endopt, and have that have...
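One way to avoid the trailing space is to emit the separator before every option except the first, rather than after each one. The following is a minimal sketch of that idea only; `render_options` is a hypothetical free function, not the library's actual rendering code, and it assumes options are plain non-empty strings.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not the library's API): renders options separated by
// single spaces. The separator is written *before* each option except the
// first, so the result never ends with a space.
std::string render_options(const std::vector<std::string>& options)
{
    std::ostringstream oss;
    bool first = true;
    for (const auto& opt : options) {
        // Assumed precondition: an empty option would yield a double space.
        assert(!opt.empty());
        if (!first) {
            oss << ' ';
        }
        oss << opt;
        first = false;
    }
    return oss.str();
}
```

With this approach, an empty option list renders as an empty string and a single option renders without any padding.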
It would be useful if one could build NVRTC programs incrementally, adding and setting headers, options, etc. at one's convenience rather than when constructing a `program_t` object.
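To illustrate what incremental construction might look like, here is a rough sketch of a builder that accumulates the source, headers and options before anything is handed to NVRTC. Every name in it (`program_builder`, `program_spec`, `set_source`, `add_header`, `add_option`, `build`) is hypothetical and chosen for illustration; none of it is the library's existing `program_t` interface.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical plain-data description of a program to be compiled.
struct program_spec {
    std::string name;
    std::string source;
    std::vector<std::pair<std::string, std::string>> headers; // (name, contents)
    std::vector<std::string> options;
};

// Hypothetical builder: state is added piece by piece, in any order.
class program_builder {
public:
    explicit program_builder(std::string name) { spec_.name = std::move(name); }

    program_builder& set_source(std::string source)
    {
        spec_.source = std::move(source);
        return *this;
    }

    program_builder& add_header(std::string name, std::string contents)
    {
        spec_.headers.emplace_back(std::move(name), std::move(contents));
        return *this;
    }

    program_builder& add_option(std::string option)
    {
        spec_.options.push_back(std::move(option));
        return *this;
    }

    // Returns the accumulated description. A real wrapper would presumably
    // create the underlying NVRTC program at this point instead.
    program_spec build() const
    {
        assert(!spec_.source.empty()); // a source must have been set
        return spec_;
    }

private:
    program_spec spec_;
};
```

A caller could then write `program_builder b("my_kernel.cu"); b.set_source(src).add_option("--std=c++17");` and add headers later, calling `build()` only once everything is in place.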
It seems `nvrtcGetProgramLogSize()` includes 1 for a trailing '\0' character, and so we end up placing that character in our return value, which is not a C-style string. Let's not...
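The trimming could look roughly like the sketch below. So that it runs without NVRTC, it uses two hypothetical stand-ins, `fake_get_log_size` and `fake_get_log`, in place of `nvrtcGetProgramLogSize()` and `nvrtcGetProgramLog()`; it assumes, as this item suggests, that the reported size counts the terminating '\0'.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-ins for the NVRTC calls, so the sketch is self-contained.
static const char fake_log[] = "warning: unused variable";

// Mimics the assumed behaviour: the size includes the trailing '\0'.
std::size_t fake_get_log_size() { return sizeof(fake_log); }

// Writes the log, including its terminator, into the caller's buffer.
void fake_get_log(char* buffer) { std::memcpy(buffer, fake_log, sizeof(fake_log)); }

// Returns the log as a std::string that does not contain the terminator.
std::string get_log()
{
    const std::size_t size_with_nul = fake_get_log_size();
    assert(size_with_nul >= 1); // at least the terminator is expected
    std::string log(size_with_nul, '\0');
    fake_get_log(&log[0]);
    log.resize(size_with_nul - 1); // drop the trailing '\0'
    return log;
}
```

After the `resize`, the string's `size()` matches the visible log text, so comparisons and concatenations behave as expected.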
`nvrtc.h` should be included in `nvrtc/types.hpp`, not `nvrtc/error.hpp`.
Provide an installation guide for installing with CMakeLists.
The nvrtc library (target) should depend on the runtime-and-driver (target), even if they're just header-only and it doesn't matter much.