cuda-core topic
cuda-core repositories
cuda_hgemv
48 stars, 4 forks
Several optimization methods for half-precision general matrix-vector multiplication (HGEMV) using CUDA cores.
decoding_attention
17 stars, 1 fork
Decoding Attention is a multi-head attention (MHA) implementation optimized for the decoding stage of LLM inference, using CUDA cores.