cuda-core topic
cuda-core repositories
cuda_hgemv
48 Stars, 4 Forks
Watchers
Several optimization methods for half-precision general matrix-vector multiplication (HGEMV) using CUDA cores.
decoding_attention
46 Stars, 4 Forks, 46 Watchers
Decoding Attention is specially optimized for MHA, MQA, GQA, and MLA, using CUDA cores for the decoding stage of LLM inference.