Eyal Rozenberg
While the driver wrappers branch has come a long way, it still uses some CUDA runtime API constants, types, and API calls. Some of this might be unavoidable, but many...
There could always be additional NVRTC compilation options which are not explicitly supported. Let's make it possible to add those with no special parsing/combination/etc. - to simply be appended to...
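A minimal sketch of the pass-through idea, assuming a hypothetical `compilation_options` structure and `render()` function (neither name is taken from the library). The explicitly supported options are rendered first, and anything in `extra_options` is appended verbatim, with no parsing or validation beyond rejecting empty strings:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical options structure for illustration; not the library's actual type.
struct compilation_options {
    bool generate_debug_info = false;        // rendered as --device-debug
    std::string cpp_dialect;                 // e.g. "c++17", rendered as --std=c++17
    std::vector<std::string> extra_options;  // passed through verbatim
};

// Render the options as one string per option, which is the form
// that nvrtcCompileProgram() takes them in.
inline std::vector<std::string> render(const compilation_options& opts)
{
    std::vector<std::string> rendered;
    if (opts.generate_debug_info) { rendered.push_back("--device-debug"); }
    if (!opts.cpp_dialect.empty()) { rendered.push_back("--std=" + opts.cpp_dialect); }
    for (const auto& extra : opts.extra_options) {
        // No parsing or combination: the string is appended exactly as given.
        assert(!extra.empty() && "an empty option string is almost certainly a mistake");
        rendered.push_back(extra);
    }
    return rendered;
}
```

With this shape, an option the wrappers have never heard of costs the user one `extra_options.push_back(...)` call, and the explicitly supported options keep their typed fields.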
With NVRTC, you can choose to either suppress, warn, or emit an error when encountering various issues in the code, using `--diag-error=`, `--diag-suppress=` and `--diag-warn=`. This is currently not...
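One way such support might look, as a sketch only: a hypothetical `render_diagnostic_option()` helper that turns a treatment and a list of diagnostic numbers into a single option string. The enum and function names are invented for illustration, and the sketch assumes the numbers are passed as a comma-separated list after the `=`:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// How a given diagnostic should be treated (hypothetical name).
enum class diagnostic_treatment { suppress, warn, error };

// Build one option of the form --diag-<treatment>=<n>,<n>,...
inline std::string render_diagnostic_option(
    diagnostic_treatment treatment,
    const std::vector<unsigned>& diagnostic_numbers)
{
    assert(!diagnostic_numbers.empty());
    std::string option;
    switch (treatment) {
    case diagnostic_treatment::suppress: option = "--diag-suppress="; break;
    case diagnostic_treatment::warn:     option = "--diag-warn=";     break;
    case diagnostic_treatment::error:    option = "--diag-error=";    break;
    }
    for (std::size_t i = 0; i < diagnostic_numbers.size(); ++i) {
        if (i > 0) { option += ','; }
        option += std::to_string(diagnostic_numbers[i]);
    }
    return option;
}
```

The resulting string can be handed to NVRTC like any other option, so this composes with whatever mechanism already collects the option list.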
Look at our modified vectorAdd example. It's certainly nicer than the original, but it's just sad that we have to repeat ourselves again and again with respect to lengths and...
Some API calls may return `cudaCpuDeviceId` to indicate host memory as a location, or `cudaInvalidDeviceId` to indicate no single location. Right now, we are completely oblivious to these values -...
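A possible direction, sketched without any CUDA dependency: interpret the raw integer an API call returns as a small tagged value instead of treating it as a device ID. The `location` type and `interpret_location()` are hypothetical names, and the two constants are local copies of the values the CUDA runtime headers define for `cudaCpuDeviceId` and `cudaInvalidDeviceId` (assumed here to be -1 and -2), so that the sketch compiles on its own:

```cpp
#include <cassert>

// Local stand-ins so the sketch builds without CUDA; real code would use
// cudaCpuDeviceId and cudaInvalidDeviceId from the runtime headers directly.
constexpr int cpu_device_id     = -1;
constexpr int invalid_device_id = -2;

enum class location_kind { device, host, none };

struct location {
    location_kind kind;
    int device_id;  // meaningful only when kind == location_kind::device
};

// Classify a raw ID returned by an API call.
inline location interpret_location(int raw_id)
{
    if (raw_id == cpu_device_id)     { return { location_kind::host, raw_id }; }
    if (raw_id == invalid_device_id) { return { location_kind::none, raw_id }; }
    assert(raw_id >= 0 && "unexpected negative device ID");
    return { location_kind::device, raw_id };
}
```

A caller can then branch on `kind` rather than constructing a device wrapper around a negative ID.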
Several years ago, the NVIDIA Technical blog / parallel-4-all published this piece: [Separate Compilation and Linking of CUDA C++ Device Code](https://developer.nvidia.com/blog/separate-compilation-linking-cuda-device-code/) and linked to an example repository: [separate-compilation-linking](https://github.com/NVIDIA-developer-blog/code-samples/tree/master/posts/separate-compilation-linking). It would...
Now that we support CUDA arrays, and do some matter-of-fact dealing with pitched CUDA Runtime API calls, it's probably time we properly expanded that to pitched memory support. Pitched memory...
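To make the addressing concrete, here is a sketch of what a non-owning view over pitched memory could look like. `pitched_view` is a hypothetical name, not an existing wrapper type. The one essential rule it encodes is that row `y` starts `y * pitch` bytes after the base address, where the pitch is given in bytes and may exceed `width * sizeof(T)`:

```cpp
#include <cassert>
#include <cstddef>

// Non-owning view of a 2D pitched allocation (hypothetical type).
template <typename T>
struct pitched_view {
    void*       base;            // start of the allocation
    std::size_t pitch_in_bytes;  // distance between consecutive rows, in bytes
    std::size_t width;           // row length, in elements
    std::size_t height;          // number of rows

    // Address of the first element of row y.
    T* row(std::size_t y) const
    {
        assert(y < height);
        return reinterpret_cast<T*>(static_cast<char*>(base) + y * pitch_in_bytes);
    }

    // Element at column x of row y.
    T& at(std::size_t x, std::size_t y) const
    {
        assert(x < width);
        return row(y)[x];
    }
};
```

The same arithmetic works for host and device addresses; on the device, `base` and `pitch_in_bytes` would come from a call such as `cudaMallocPitch()`.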
**[NVIDIA/gdrcopy](https://github.com/NVIDIA/gdrcopy)**:
> **A low-latency GPU memory copy library based on NVIDIA GPUDirect RDMA technology.**
>
> **Introduction**
>
> While GPUDirect RDMA is meant for direct access to GPU memory from third-party devices,...
As we all know (or should know), the C++ standard library's smart pointer classes suck. Why? Because ownership should be of _regions_, not pointers, and it's inane to expect allocators or...
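To illustrate what "owning a region" could mean in code, here is a rough sketch in which the owned object is an (address, size) pair instead of a bare pointer. Both `region` and `unique_region` are names made up for this example, and host `malloc`/`free` stand in for whatever device allocation calls a real implementation would use:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>

// A contiguous stretch of memory: where it starts and how long it is.
struct region {
    void*       start = nullptr;
    std::size_t size  = 0;
};

// Move-only owner of a region (hypothetical; host memory used as a stand-in).
class unique_region {
public:
    unique_region() = default;

    explicit unique_region(std::size_t size_in_bytes)
    {
        region_.start = std::malloc(size_in_bytes);
        region_.size  = size_in_bytes;
        assert(region_.start != nullptr || size_in_bytes == 0);
    }

    unique_region(unique_region&& other) noexcept : region_(other.release()) {}

    unique_region& operator=(unique_region&& other) noexcept
    {
        if (this != &other) { reset(); region_ = other.release(); }
        return *this;
    }

    unique_region(const unique_region&) = delete;
    unique_region& operator=(const unique_region&) = delete;

    ~unique_region() { reset(); }

    // The owner always knows the size, so users need not track it separately.
    region get() const noexcept { return region_; }

    // Give up ownership without freeing.
    region release() noexcept
    {
        region released = region_;
        region_ = region();
        return released;
    }

    // Free the owned memory, if any, and become empty.
    void reset() noexcept
    {
        std::free(region_.start);
        region_ = region();
    }

private:
    region region_;
};
```

The difference from `std::unique_ptr` that matters here is that the size travels with the ownership, so code receiving a `unique_region` can copy or fill it without a separate length argument.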
By now, a lot of the wrapper code is located within `detail::` sub-namespaces, interspersed among the actual, intended-for-use functions. Additionally, a lot of the implementations of non-`detail::` functions are already...