Allard Hendriksen

29 comments by Allard Hendriksen

Removing `cuda::ptx::mapa` again due to a bug. See #1442 and #1414.

Hi, thanks for reporting this issue! It sure looks complicated. I will try to look into it, but I might not be able to help you. If I understand correctly,...

Hi, thanks for the clarifying comment. I believe I understand the problem now :) I tried to reproduce the problem for myself by first creating a forwarded socket: ```...

For the shared library, I would recommend defining `CUVS_EXPLICIT_INSTANTIATE_ONLY`. This will catch unintended template instantiations at compile time and avoid an explosion in compile time and binary size.
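As a rough illustration of the idea behind an "explicit instantiate only" switch, here is a minimal single-file sketch. It is not the cuVS implementation: the macro `MYLIB_EXPLICIT_INSTANTIATE_ONLY`, the helper `dependent_false`, and the function `twice` are all hypothetical names. A real library would typically pair `extern template` declarations in the header with explicit instantiations in one source file; full specializations stand in for those precompiled instantiations here so that the example fits in one file.

```cpp
// Hypothetical stand-in for a macro such as CUVS_EXPLICIT_INSTANTIATE_ONLY.
#define MYLIB_EXPLICIT_INSTANTIATE_ONLY

// Always false, but dependent on T, so the static_assert below only fires
// when the primary template is actually instantiated.
template <typename T>
struct dependent_false {
  static constexpr bool value = false;
};

#ifdef MYLIB_EXPLICIT_INSTANTIATE_ONLY
// Primary template: instantiating it for an unsupported type is a compile error.
template <typename T>
T twice(T) {
  static_assert(dependent_false<T>::value,
                "twice<T> is not precompiled for this T");
  return T{};
}

// Stand-ins for the instantiations the shared library would ship.
template <>
inline int twice<int>(int x) { return x + x; }

template <>
inline float twice<float>(float x) { return x + x; }
#else
// Without the macro, any T is instantiated implicitly on use.
template <typename T>
T twice(T x) { return x + x; }
#endif
```

With the macro defined, `twice(21)` and `twice(1.5f)` compile and use the provided versions, while a call such as `twice(1.5)` (a `double`) is rejected by the `static_assert` at compile time instead of silently generating a new instantiation.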

Please don't merge yet. I still have to incorporate some internal feedback.

In the meantime, the code example for on-device tensor map modification has made it into the CUDA programming guide. Instead of duplicating the documentation and code sample, I have...

The batch timings are often ignored, see also #194. Perhaps it is possible to introduce an abstraction that makes it easy to always grab the "best" timing, depending on...