wannier_tools
Another attempt to offload computation to cuBLAS.
We offloaded the matrix multiplication (mat_mul) to the GPU using cuBLAS through a C interface, and obtained at least a 2x speedup in the SlabSS_calc task on an A100. The idea is still very preliminary, but it is compatible with the existing MPI parallelization. We hope this project serves as a catalyst and inspires more people to optimize and refine it. https://github.com/pkusc/wannier_tools_cuda
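
For readers who want a feel for the approach, here is a minimal sketch of what such a C wrapper could look like. It is not the code from the repository: the function name `zgemm_offload`, the `USE_CUBLAS` macro, and the square-matrix restriction are assumptions made for illustration. The wrapper multiplies two n x n column-major double-complex matrices (the layout Fortran uses), calling `cublasZgemm` when built with `-DUSE_CUBLAS` and falling back to a plain CPU loop otherwise, so it can be compiled and checked on a machine without a GPU.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#ifdef USE_CUBLAS              /* hypothetical build flag, not from the repository */
#include <cuda_runtime.h>
#include <cuComplex.h>
#include <cublas_v2.h>
#endif

/* Hypothetical wrapper: C = A * B for n x n column-major complex matrices.
 * Returns 0 on success, nonzero on any CUDA/cuBLAS failure. */
int zgemm_offload(int n, const double complex *A,
                  const double complex *B, double complex *C)
{
    assert(n > 0 && A && B && C);
#ifdef USE_CUBLAS
    /* double complex and cuDoubleComplex share the same (re, im) layout. */
    size_t bytes = (size_t)n * n * sizeof(cuDoubleComplex);
    cuDoubleComplex *dA = NULL, *dB = NULL, *dC = NULL;
    const cuDoubleComplex one  = make_cuDoubleComplex(1.0, 0.0);
    const cuDoubleComplex zero = make_cuDoubleComplex(0.0, 0.0);
    cublasHandle_t handle;
    int rc = 1;

    if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS)
        return 1;
    /* Allocate, copy in, multiply, copy out; stop at the first failure. */
    if (cudaMalloc((void **)&dA, bytes) == cudaSuccess &&
        cudaMalloc((void **)&dB, bytes) == cudaSuccess &&
        cudaMalloc((void **)&dC, bytes) == cudaSuccess &&
        cudaMemcpy(dA, A, bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
        cudaMemcpy(dB, B, bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
        cublasZgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                    &one, dA, n, dB, n, &zero, dC, n) == CUBLAS_STATUS_SUCCESS &&
        cudaMemcpy(C, dC, bytes, cudaMemcpyDeviceToHost) == cudaSuccess)
        rc = 0;
    cudaFree(dA);   /* cudaFree(NULL) is a no-op, so this is safe on failure */
    cudaFree(dB);
    cudaFree(dC);
    cublasDestroy(handle);
    return rc;
#else
    /* CPU fallback: naive triple loop over column-major storage. */
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            double complex sum = 0.0;
            for (int k = 0; k < n; ++k)
                sum += A[i + (size_t)k * n] * B[k + (size_t)j * n];
            C[i + (size_t)j * n] = sum;
        }
    }
    return 0;
#endif
}
```

Because the arrays are already column-major, a Fortran caller could in principle reach a function like this through an `interface` block with `bind(C)`, without transposing anything. Note that this sketch creates the cuBLAS handle and copies both matrices on every call; in a real hot loop one would likely create the handle once and keep data on the device across calls, since transfer cost can easily cancel out the gain for small matrices. Since each MPI rank calls the wrapper independently, a sketch like this does not interfere with the existing MPI decomposition.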