Issues by Zeyu Wang (1 result)
Line 376 of "cudaTensorCoreGemm.cu", "float *tile_ptr = shmem_warp_tile_ptr + i * SHMEM_STRIDE * K + j * N;", should be changed to "float *tile_ptr = shmem_warp_tile_ptr...
bug
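For context, the expression quoted in the issue is offset arithmetic that locates one tile inside a larger buffer. The sketch below shows that kind of calculation in general terms, assuming a row-major buffer. The helper tile_offset and its parameters row_stride, tile_rows, and tile_cols are hypothetical names, not identifiers from the sample, and the sizes are illustrative only. It does not reproduce the proposed replacement line, which is truncated in the listing above.

```cpp
#include <cstddef>

// Hypothetical helper (not part of cudaTensorCoreGemm.cu): element offset of
// the top-left corner of tile (i, j) in a row-major buffer.
//   row_stride : elements between the starts of consecutive buffer rows
//   tile_rows  : height of one tile, in rows
//   tile_cols  : width of one tile, in columns
constexpr std::size_t tile_offset(std::size_t i, std::size_t j,
                                  std::size_t row_stride,
                                  std::size_t tile_rows,
                                  std::size_t tile_cols) {
  // Skip i tiles downward (i * tile_rows full rows), then j tiles across.
  return i * tile_rows * row_stride + j * tile_cols;
}

// Compile-time checks with illustrative sizes (assumed, not from the sample):
// 16x16 tiles in a buffer that is 128 elements wide.
static_assert(tile_offset(0, 0, 128, 16, 16) == 0, "first tile starts at 0");
static_assert(tile_offset(1, 2, 128, 16, 16) == 1 * 16 * 128 + 2 * 16,
              "tile (1, 2) is 16 rows down and 32 columns across");
```

In this layout the row index is scaled by the tile height and the column index by the tile width. When the tiles are square, the choice of constant in the row term does not change the computed address, only how clearly the code states its intent.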