Yujia Zhai
Results
2 repositories owned by Yujia Zhai
Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
264 stars, 43 forks
Optimizing SGEMM kernel functions on NVIDIA GPUs to close-to-cuBLAS performance.
Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F
102 stars, 19 forks
Stepwise optimizations of DGEMM on CPU, eventually reaching performance faster than Intel MKL, even under multithreading.