Jack Kosaian

62 comments by Jack Kosaian

> Is it possible to construct kernel-required tma within kernel source behind NVRTC, so that host code can normally launch parameters without recomputing tma data-structures, e.g. kernelLaunch(.. , &dptr_A) ~~This...

I misspoke in the comment above. It is possible to construct Tensor Maps on device. [Here](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html?highlight=Encoding%2520a%2520tensor%2520map%2520on%2520device#encoding-a-tensor-map-on-device) is some documentation on this. However, this has not been experimented with in...