cutlass topic
cutlass repositories
ConvolutionBuildingBlocks
20 stars, 3 forks
GEMM- and Winograd-based convolutions using CUTLASS
flash_attention_inference
20 stars, 2 forks
Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.