cuda-api-wrappers
Thin, unified, C++-flavored wrappers for the CUDA APIs
It is easy to confuse non-mangled with mangled names. Let's enforce a hard separation of them using a (non-owning) class.
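One way such a separation could look is sketched below, assuming a tag-parameterized, non-owning view over a NUL-terminated string. All names here (`symbol_name_view`, `mangled_name_view`, `unmangled_name_view`, `is_same_symbol`) are hypothetical and are not taken from the library.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical tag types; they only exist to make the two views distinct types.
struct mangled_tag {};
struct unmangled_tag {};

// A non-owning view of a NUL-terminated symbol name, tagged by its kind.
template <typename Tag>
class symbol_name_view {
public:
    // Explicit, so a raw const char* never silently becomes either kind of name.
    constexpr explicit symbol_name_view(const char* str) noexcept : str_(str) {}
    constexpr const char* c_str() const noexcept { return str_; }

private:
    const char* str_; // not owned; the caller must keep the string alive
};

using mangled_name_view   = symbol_name_view<mangled_tag>;
using unmangled_name_view = symbol_name_view<unmangled_tag>;

// Accepts mangled names only: passing an unmangled_name_view, or a raw
// const char*, is a compile-time error rather than a run-time surprise.
inline bool is_same_symbol(mangled_name_view lhs, mangled_name_view rhs) noexcept
{
    return std::strcmp(lhs.c_str(), rhs.c_str()) == 0;
}
```

Because the two views are unrelated types, mixing them up fails to compile, while the view itself stays as cheap to copy as the pointer it wraps.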
Our implementation of optional for pre-C++17 has a bug: `optional::value_or()` doesn't return a value :-(
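For reference, a `value_or()` that does return a value might look like the sketch below. It uses a deliberately tiny, hypothetical `tiny_optional` class (not the library's implementation) and assumes `T` is default-constructible, to keep storage handling out of the picture.

```cpp
#include <cassert>
#include <utility>

// Hypothetical, minimal stand-in for a pre-C++17 optional.
template <typename T>
class tiny_optional {
public:
    tiny_optional() : has_value_(false), value_() {}
    tiny_optional(const T& v) : has_value_(true), value_(v) {}

    bool has_value() const noexcept { return has_value_; }

    // Returns the contained value if there is one, otherwise the fallback
    // converted to T. Note the explicit return statement.
    template <typename U>
    T value_or(U&& fallback) const
    {
        return has_value_ ? value_ : static_cast<T>(std::forward<U>(fallback));
    }

private:
    bool has_value_;
    T value_; // assumption: T is default-constructible; real code would use raw storage
};
```

A call such as `tiny_optional<int>().value_or(42)` then yields the fallback, and a populated optional yields its own value.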
CUDA modules can be loaded either eagerly or lazily (and this is controlled by an environment variable). It's a global setting, which can be queried using [`cuModuleGetLoadingMode()`](https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MODULE.html): ``` CUresult cuModuleGetLoadingMode...
We currently define a macro named `CAN_GET_APRIORI_KERNEL_HANDLE` in our headers. While that's not a very likely name to clash, let's be nice and prefix that with `CAW_`, to be extra...
We have an array in `simpleIPC.cu` which, if no Unified Virtual Addressing devices are present, will not have its first element initialized before that element is accessed. But it's accessed...
Beginning with CUDA 12.3, we have `cuFuncGetName()` and `cuFuncGetModule()`, so we can obtain a `cuda::kernel_t`'s name and its containing `module_t` :-)
In the simpleCudaGraphs example, in function `cudaGraphsManual`, we have:
```
cuda::graph::node::parameters_t* params_ptr = nullptr;
cuda::graph::instance::set_node_parameters(instance, reduce_final_node, *params_ptr);
```
that should not have been there... :-( Also, graph execution is currently...
Beginning with CUDA 12.4, we are able to "enumerate" the functions in a module, or more literally - get an array of all `CUfunction`s for all compiled functions in a...
Some methods of the `grid::dimensions_t` and `grid::composite_dimensions_t` classes, which could be marked `constexpr` and/or `noexcept`, are not marked as such.
It might be convenient for some users to treat `grid::dimensions_t` and `grid::overall_dimensions_t` as arrays of dimensions for the different axes. So, let's allow this via a (const and non-const) `operator[]`.
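A rough sketch of what per-axis indexing, together with `constexpr`/`noexcept` marking, could look like follows. It uses a hypothetical `dims3` struct rather than the library's actual classes, and assumes C++11, where only the const overload can be `constexpr`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a three-axis dimensions structure.
struct dims3 {
    unsigned x, y, z;

    // Total number of elements over all three axes.
    constexpr std::size_t volume() const noexcept
    {
        return static_cast<std::size_t>(x) * y * z;
    }

    // Const access by axis index: 0 is x, 1 is y, anything else maps to z.
    // There is no bounds checking in this sketch.
    constexpr const unsigned& operator[](std::size_t axis) const noexcept
    {
        return axis == 0 ? x : (axis == 1 ? y : z);
    }

    // Non-const access, so that a single axis can be assigned to.
    unsigned& operator[](std::size_t axis) noexcept
    {
        return axis == 0 ? x : (axis == 1 ? y : z);
    }
};

// The const overload is usable in constant expressions.
constexpr dims3 example_dims{2, 3, 4};
static_assert(example_dims[1] == 3, "axis 1 should be y");
static_assert(example_dims.volume() == 24, "2 * 3 * 4");
```

With both overloads in place, `d[0] = 5;` works on a mutable object, while a `const` or `constexpr` object can still be read axis by axis.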