cuda-api-wrappers
Thin, unified, C++-flavored wrappers for the CUDA APIs
With CUDA 11.2, NVRTC added some API functions for determining the target architectures it supports. Let's add support for that.
Somehow, NVIDIA's separate library for compiling PTX code into SASS escaped me... It's documented at: https://docs.nvidia.com/cuda/ptx-compiler-api/index.html and we should definitely add support for it. There's a "handle" type, similar to...
While the driver wrappers branch has come a long way, it still uses some CUDA runtime API constants, types, and API calls. Some of this might be unavoidable, but many...
There could always be additional NVRTC compilation options which are not explicitly supported. Let's make it possible to add those with no special parsing/combination/etc. - to simply be appended to...
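The pass-through idea can be pictured as a list of raw option strings kept next to the explicitly supported ones and appended verbatim when the final argument list is rendered. This is only a minimal sketch: the `compilation_options` struct, its fields, and `render` are hypothetical names for illustration, not the library's actual API. The `--device-debug` flag is a documented NVRTC option.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical options struct; field names are illustrative only.
struct compilation_options {
    bool generate_debug_info = false;        // an explicitly supported option
    std::vector<std::string> extra_options;  // passed through without parsing
};

// Renders the option strings in the order the compiler would receive them.
std::vector<std::string> render(const compilation_options& opts)
{
    std::vector<std::string> result;
    if (opts.generate_debug_info) {
        result.push_back("--device-debug");
    }
    // Append the raw options verbatim: no validation, no de-duplication.
    result.insert(result.end(), opts.extra_options.begin(), opts.extra_options.end());
    return result;
}
```

Keeping the raw strings last means they come after the explicitly supported options in the rendered list; whether a later option should override an earlier one is left to the compiler.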
With NVRTC, you can choose to either suppress, warn, or emit an error when encountering various issues in the code, using `--diag-error=`, `--diag-suppress=`, and `--diag-warn=`. This is currently not...
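A rough sketch of how such options might be rendered follows. It assumes the diagnostics are identified by number and passed as a comma-separated list after the `=`. The `diag_action` enum and both functions are hypothetical names, and the diagnostic numbers in the usage note are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical enum: what to do when a given diagnostic is encountered.
enum class diag_action { error, suppress, warn };

// Maps each action to the corresponding NVRTC option prefix.
const char* prefix_for(diag_action action)
{
    switch (action) {
    case diag_action::error:    return "--diag-error=";
    case diag_action::suppress: return "--diag-suppress=";
    case diag_action::warn:     return "--diag-warn=";
    }
    return ""; // not reached for valid enumerators
}

// Builds one option string: the prefix followed by comma-separated numbers.
std::string render_diag_option(diag_action action, const std::vector<int>& numbers)
{
    std::string result = prefix_for(action);
    for (std::size_t i = 0; i < numbers.size(); ++i) {
        if (i > 0) { result += ','; }
        result += std::to_string(numbers[i]);
    }
    return result;
}
```

For instance, `render_diag_option(diag_action::suppress, {177, 550})` yields the string `--diag-suppress=177,550`.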
Look at our modified vectorAdd example. It's certainly nicer than the original, but it's just sad that we have to repeat ourselves again and again with respect to lengths and...
Some API calls may return `cudaCpuDeviceId` to indicate host memory as a location, or `cudaInvalidDeviceId` to indicate no single location. Right now, we are completely oblivious to these values -...
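One possible way to stop ignoring these sentinels is to translate the raw integer into a small tagged value before using it as a device ID. The sketch below is an assumption about how that could look: `location`, `location_kind`, and `classify` are hypothetical names, and the two constants are redefined locally (with the values the CUDA runtime headers give `cudaCpuDeviceId` and `cudaInvalidDeviceId`) so that the snippet compiles without CUDA.

```cpp
#include <cassert>

// Local stand-ins for the CUDA runtime's cudaCpuDeviceId and
// cudaInvalidDeviceId macros, so this sketch needs no CUDA headers.
constexpr int cpu_device_id     = -1;
constexpr int invalid_device_id = -2;

// Hypothetical classification of what a returned "device ID" really means.
enum class location_kind { device, host, none };

struct location {
    location_kind kind;
    int device_id; // meaningful only when kind == location_kind::device
};

// Converts a raw ID returned by the API into a tagged location.
location classify(int raw_id)
{
    if (raw_id == cpu_device_id)     { return { location_kind::host, raw_id }; }
    if (raw_id == invalid_device_id) { return { location_kind::none, raw_id }; }
    return { location_kind::device, raw_id };
}
```

Callers can then branch on `kind` instead of accidentally treating `-1` or `-2` as a real device index.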
Several years ago, the NVIDIA Technical blog / parallel-4-all published this piece: [Separate Compilation and Linking of CUDA C++ Device Code](https://developer.nvidia.com/blog/separate-compilation-linking-cuda-device-code/) and linked to an example repository: [separate-compilation-linking](https://github.com/NVIDIA-developer-blog/code-samples/tree/master/posts/separate-compilation-linking). It would...
Now that we support CUDA arrays, and do some matter-of-fact dealing with pitched CUDA Runtime API calls, it's probably time we properly expanded that to pitched memory support. Pitched memory...
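For context, the core of pitched addressing is that consecutive rows are `pitch` bytes apart rather than `width * sizeof(T)` bytes apart. A minimal sketch of the element-address computation, using a hypothetical helper name `pitched_element` and plain host pointers so it can be tried without a GPU:

```cpp
#include <cassert>
#include <cstddef>

// Returns the address of element (row, column) in a pitched 2D region.
// The pitch is in bytes and may exceed the row's width times sizeof(T).
template <typename T>
T* pitched_element(void* base, std::size_t pitch_in_bytes,
                   std::size_t row, std::size_t column)
{
    // Step to the start of the row in bytes, then index within the row in elements.
    unsigned char* row_start = static_cast<unsigned char*>(base) + row * pitch_in_bytes;
    return reinterpret_cast<T*>(row_start) + column;
}
```

This mirrors the formula the CUDA documentation gives for memory obtained from `cudaMallocPitch`; a wrapper type would presumably carry the base pointer and pitch together so callers cannot mix them up.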
Added with CUDA 10.0, [CUDA graphs](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cuda-graphs) (though really terribly named, as there are 7 "graph" things that CUDA provides) are a take on task graphs within CUDA's programming model. What's...