Allard Hendriksen
Removing `cuda::ptx::mapa` again due to bug. See #1442 #1414
Hi, thanks for reporting this issue! It sure looks complicated. I will try to look into it, but I might not be able to help you. If I understand correctly,...
Hi, Thanks for the clarifying comment. I believe I understand the problem now :) I tried to reproduce the problem for myself by first creating a socket forwarded socket: ```...
For the shared library, I would recommend defining `CUVS_EXPLICIT_INSTANTIATE_ONLY`. This will catch unintended template instantiations at compile time and avoid an explosion in compile time and binary size.
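To illustrate the general idea behind such a flag, here is a minimal single-file sketch. It does not show how cuVS actually implements `CUVS_EXPLICIT_INSTANTIATE_ONLY`; the macro `DEMO_EXPLICIT_INSTANTIATE_ONLY`, the trait `demo_is_instantiated`, and the function `demo_scale` are all hypothetical names made up for this example. The assumption is that the library ships explicit instantiations for a fixed set of types and wants any other use to fail at compile time instead of silently generating new code in every translation unit.

```cpp
#include <type_traits>

// Hypothetical flag; in a real build this would usually come from the
// compiler command line (e.g. -DDEMO_EXPLICIT_INSTANTIATE_ONLY).
#define DEMO_EXPLICIT_INSTANTIATE_ONLY

// Allow-list of the types the (hypothetical) library instantiates itself.
template <typename T>
struct demo_is_instantiated : std::false_type {};
template <>
struct demo_is_instantiated<float> : std::true_type {};
template <>
struct demo_is_instantiated<double> : std::true_type {};

// Function template guarded by the flag: with the flag defined, instantiating
// it for a type outside the allow-list is a compile-time error.
template <typename T>
T demo_scale(T x)
{
#ifdef DEMO_EXPLICIT_INSTANTIATE_ONLY
  static_assert(demo_is_instantiated<T>::value,
                "demo_scale<T> is not explicitly instantiated for this T");
#endif
  return x * T(2);
}

// Explicit instantiation definitions: the only instantiations that are
// supposed to exist. In a real library these would live in a source file.
template float demo_scale<float>(float);
template double demo_scale<double>(double);

// Uses only the allow-listed types, so it compiles with the flag defined.
bool demo_check()
{
  // demo_scale(1);  // would not compile with the flag: int is not allow-listed
  return demo_scale(1.5f) == 3.0f && demo_scale(2.0) == 4.0;
}
```

In a real library, the header would more likely carry only a declaration plus `extern template` declarations, with the definitions compiled once into the shared library; the allow-list here is a simplification so that the sketch fits in one file.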
Please don't merge yet. I still have to incorporate some internal feedback.
In the meantime, the code example for on-device tensor map modification has made it into the CUDA programming guide. Instead of duplicating the documentation and code sample, I have...
/ok to test
/ok to test
The batch timings are often ignored; see also #194. Perhaps it is possible to introduce an abstraction that makes it easy to always grab the "best" timing, depending on...