Yujia Zhai
Results: 2 repositories owned by Yujia Zhai
Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
237 stars, 41 forks
Optimizing SGEMM kernel functions on NVIDIA GPUs to reach close-to-cuBLAS performance.
Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F
89 stars, 18 forks
Stepwise optimizations of DGEMM on CPU that eventually outperform Intel MKL, even under multithreading.