Results 13 comments of DeMoriarty

I'd recommend looking into [cutlass](https://github.com/NVIDIA/cutlass), which is open source and has reliable performance on various GPU architectures.

thanks for pointing this out. changed it to `search` a while back but forgot to edit the README.

thanks for the suggestion! will work on it once I have time, but anyone who wants to take it on is welcome. all contributions are appreciated. keeping this open as a...