cuda-core topic

List cuda-core repositories

cuda_hgemv

48 stars, 4 forks

Several optimization methods for half-precision general matrix-vector multiplication (HGEMV) using CUDA cores.

decoding_attention

17 stars, 1 fork

Decoding Attention is specially optimized for multi-head attention (MHA) in the decoding stage of LLM inference, using CUDA cores.