Liyan Chen

2 issues by Liyan Chen

Hello, I had the same issue as #8 part (2) when installing with PyTorch versions other than the latest (1.8.1, 1.9.1, etc.). Test environments: Ubuntu 20.04 and 18.04, Python 3.7 and 3.8, GCC...

Hello, I'm curious whether the implementation uses the `ldmatrix` instruction for loading tiles from shared memory to registers. It seems the current version doesn't implement `load()` with explicit `ldmatrix` per...