cuda-api-wrappers
Thin, unified, C++-flavored wrappers for the CUDA APIs
Let's add the bandwidthTest CUDA sample program to our modified CUDA samples.
Hello, the testcase `vectorAdd_profiled` doesn't work out of the box for me:
```sh
./vectorAdd_profiled
terminate called after throwing an instance of 'cuda::runtime_error'
  what():  Starting CUDA profiling: initialization error
```
after adding...
Currently, scoped_existence_ensurer_t uses `context::current::detail_::get_handle()`, which assumes the driver has been initialized. To drop this assumption, we need to check the return status of the CUDA driver API call.
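A minimal sketch of the status check described above, not the library's actual code. Everything here is a hypothetical stand-in: `fake_driver` mimics a driver call such as `cuCtxGetCurrent()`, which can report `CUDA_ERROR_NOT_INITIALIZED`, and `try_get_current_handle()` is an illustrative name. The point is only that "driver not initialized" is treated as "no current context" instead of being assumed away.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for driver status codes; the real ones are in <cuda.h>.
enum class status_t { success = 0, not_initialized = 3, unknown = 999 };

using context_handle_t = void*;

// Hypothetical stand-in for the driver, so the sketch runs without a GPU.
struct fake_driver {
    bool initialized = false;
    context_handle_t current = nullptr;

    // Mimics the shape of a "get current context" driver call.
    status_t get_current(context_handle_t* out) const {
        assert(out != nullptr);
        if (!initialized) { return status_t::not_initialized; }
        *out = current;
        return status_t::success;
    }
};

// Returns true and sets `handle` only if a current context exists.
// An uninitialized driver is reported as "no context" rather than as an error.
bool try_get_current_handle(const fake_driver& driver, context_handle_t& handle) {
    context_handle_t result = nullptr;
    const status_t status = driver.get_current(&result);
    if (status == status_t::not_initialized) { return false; }
    if (status != status_t::success) {
        throw std::runtime_error("Failed obtaining the current context handle");
    }
    if (result == nullptr) { return false; }
    handle = result;
    return true;
}
```

With a check shaped like this, a scoped ensurer could decide whether it needs to initialize the driver and create or activate a context before proceeding.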
The launch config builder (#311) is awesome. Let's use it more! There are lots of examples which do this work themselves rather than just availing themselves of the builder.
We can query the device associated with a launch config builder object, but we're missing the method(s) for setting it.
The launch config builder accepts overall dimensions as `size_t` values, but those may exceed what CUDA supports. So, we need to check that the values are supported, at least in...
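The kind of check meant here could look roughly like the following one-dimensional sketch. It is not the builder's implementation: `grid_limits`, `launch_dims_1d` and `make_launch_dims()` are hypothetical names, and the limit values are assumptions matching commonly documented maxima; real code would query them from the device.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Assumed limits, for illustration only; query the actual device instead.
struct grid_limits {
    std::size_t max_grid_x  = 2147483647; // 2^31 - 1
    std::size_t max_block_x = 1024;
};

struct launch_dims_1d {
    unsigned grid_x;
    unsigned block_x;
};

// Hypothetical helper: turns an overall size_t extent into validated
// grid and block dimensions, throwing if CUDA could not launch them.
inline launch_dims_1d make_launch_dims(
    std::size_t overall_size,
    std::size_t block_size,
    const grid_limits& limits = grid_limits{})
{
    if (overall_size == 0) {
        throw std::invalid_argument("Overall dimension must be positive");
    }
    if (block_size == 0 || block_size > limits.max_block_x) {
        throw std::invalid_argument("Unsupported block dimension");
    }
    // Ceiling division, written so that it cannot overflow size_t.
    const std::size_t num_blocks =
        overall_size / block_size + (overall_size % block_size != 0 ? 1 : 0);
    if (num_blocks > limits.max_grid_x) {
        throw std::invalid_argument("Overall dimension requires too many blocks");
    }
    // The narrowing below is only safe if the limits fit the target type.
    assert(num_blocks <= std::numeric_limits<unsigned>::max());
    return { static_cast<unsigned>(num_blocks), static_cast<unsigned>(block_size) };
}
```

For example, `make_launch_dims(1000, 256)` yields 4 blocks of 256 threads, while a block size of 2048 is rejected under the assumed limits.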
Beginning with CUDA 12.0, we now have access to several functions for handling "libraries" of context-less "kernels": https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__LIBRARY.html One can get a context-associated module or kernel by calling `cuKernelGetFunction()` or...
It seems like the link options of a module are not used anywhere after its creation. Well, let's drop them then.
`apriori_compiled_kernel_t` -> `kernel::apriori_compiled_t` makes more sense... let's move it there. Also, move some functions into a `kernel::apriori_compiled` sub-namespace.
Many of the objects we wrap in the library have all sorts of "attributes" or "properties", with API functions for getting and setting them. At the moment, we reflect...