@ostix360 Could you provide a minimal reproducible code snippet, including the inputs to `chunk_rwkv6`?
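Something along these lines would do (untested sketch; the argument names, shapes, and return signature are my assumptions and may differ across versions):

```python
import torch
from fla.ops.rwkv6 import chunk_rwkv6

B, H, T, D = 2, 4, 1024, 64  # batch, heads, sequence length, head dim
device = 'cuda'

r = torch.randn(B, H, T, D, device=device)
k = torch.randn(B, H, T, D, device=device)
v = torch.randn(B, H, T, D, device=device)
# data-dependent decay in log space (must be <= 0)
w = torch.nn.functional.logsigmoid(torch.randn(B, H, T, D, device=device))
u = torch.randn(H, D, device=device)  # per-head bonus term

out = chunk_rwkv6(r, k, v, w, u)  # may return (o, final_state) depending on version
print(out[0].shape if isinstance(out, tuple) else out.shape)
```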
@ostix360 Hello, sorry for the late reply. I just ran the examples and they passed seamlessly. Could you share your Triton version and hardware info?
@ostix360 I'm not sure if the 4070 Ti would be OK. Do other kernels work for you, e.g., `chunk_gla`? If neither does, I think Triton 2.2 is not compatible with the 4070 Ti for the current...
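For a quick check of `chunk_gla`, something like this should run (untested sketch; shapes, argument names, and return signature are assumptions):

```python
import torch
from fla.ops.gla import chunk_gla

B, H, T, D = 2, 4, 512, 64
q = torch.randn(B, H, T, D, device='cuda')
k = torch.randn(B, H, T, D, device='cuda')
v = torch.randn(B, H, T, D, device='cuda')
# log forget gates, <= 0
g = torch.nn.functional.logsigmoid(torch.randn(B, H, T, D, device='cuda'))

out = chunk_gla(q, k, v, g)  # may return (o, final_state) in some versions
print(out[0].shape if isinstance(out, tuple) else out.shape)
```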
Does `chunk_rwkv6` work for you?
https://github.com/sustcsonglin/flash-linear-attention/blob/main/tests/ops/test_gla.py

Put the test file in the same-level folder as `fla`, then run

```sh
$ pytest -s test_gla.py
```

for GLA; RWKV6 works similarly. You can also simply run the built-in checks:

```sh
$ python -m fla.ops.rwkv6.chunk_naive
```
@ostix360 Hello, could you check it again? I just pushed some commits to fix some issues with initial states.
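If you want a quick way to exercise that path, a consistency check like the following is roughly what the fix targets (untested sketch; the `initial_state`/`output_final_state` argument names are assumptions): run the full sequence at once, then run it in two halves while carrying the state across, and compare.

```python
import torch
from fla.ops.rwkv6 import chunk_rwkv6

B, H, T, D = 1, 2, 256, 64
mk = lambda: torch.randn(B, H, T, D, device='cuda')
r, k, v = mk(), mk(), mk()
w = torch.nn.functional.logsigmoid(mk())  # log decay, <= 0
u = torch.randn(H, D, device='cuda')

# full sequence in one call
o_full, _ = chunk_rwkv6(r, k, v, w, u, output_final_state=True)

# two halves, threading the recurrent state through `initial_state`
h = T // 2
o1, s1 = chunk_rwkv6(r[:, :, :h], k[:, :, :h], v[:, :, :h], w[:, :, :h], u,
                     output_final_state=True)
o2, _ = chunk_rwkv6(r[:, :, h:], k[:, :, h:], v[:, :, h:], w[:, :, h:], u,
                    initial_state=s1, output_final_state=True)

# outputs should agree up to numerical tolerance
print((o_full - torch.cat([o1, o2], dim=2)).abs().max())
```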
@uniartisan Hello, many thanks for these great contributions! I will run some checks soon. However, could you restrict the revisions to the RWKV6 chunk kernel only? You've defined many decorators for...
https://github.com/sustcsonglin/flash-linear-attention/blob/8dea8bdaa14eb1f2a06152691dcd238043811fe6/tests/ops/test_rwkv6.py

This file seems broken.
Also, it is not recommended to strip the trailing spaces at the end of each line in the README, as they are sometimes intentional: in Markdown, a line ending with two spaces renders as a hard line break.
@uniartisan Hi, I just left some review comments, could you take a look?