Zhiyuan Li
Triton, Tilelang, CuteTile, CuteDSL, and CUDA C could be worth considering.
Thanks for contributing! @yiyousong, could you add tests to your contribution? This will improve the robustness of the code.

@yzhangcs, could you please give some comments?

tests: https://github.com/fla-org/flash-linear-attention/blob/main/tests/ops/test_linear_attn.py
layers: https://github.com/fla-org/flash-linear-attention/blob/main/fla/layers/linear_attn.py
> > Thanks for contributing, can you @yiyousong add tests to your contribution? This will improve the robustness of the code
> >
> > @yzhangcs could you please give some comments?...