Zhiyuan Li

57 comments by Zhiyuan Li

Thank you very much for your work. I want to implement vLLM support for RWKV, and your implementation gives me a great reference. Can I work with you to bring...

> Hey @uniartisan, sure I can work with you to implement support for RWKV. Perhaps you could start a draft PR, and we can talk there. Could you link...

Examples here: https://github.com/uniartisan/RWKV-PEFT/blob/device-enhance/train.py#L499 There are still a lot of things to be checked; I will work through them later and make them clearer in the documentation.

> hey @uniartisan are you willing to finish this one up? It would be a welcome contribution

Sorry for my late reply. I will sort them out tomorrow! 😝

> will you add DirectML?

What I've added is the code for plug-in registration, which means you can register DirectML yourself. Just write a few simple functions 🤓
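As a rough illustration, plug-in registration of this kind usually boils down to a name-to-factory mapping. This is a generic sketch, not the project's actual registration API; `register_backend`, `get_backend`, and the `"directml"` factory are hypothetical names:

```python
# Hypothetical backend registry sketch: a plug-in system keeps a
# name -> factory mapping and lets third parties register entries.
_BACKENDS = {}

def register_backend(name):
    """Decorator that records a backend factory under `name`."""
    def decorator(factory):
        _BACKENDS[name] = factory
        return factory
    return decorator

def get_backend(name):
    """Instantiate a registered backend, or fail with a clear error."""
    if name not in _BACKENDS:
        raise ValueError(f"unknown backend: {name}")
    return _BACKENDS[name]()

@register_backend("directml")
def _make_directml():
    # A real plug-in would return a device handle here,
    # e.g. via the torch_directml package.
    return "directml-device"
```

With a registry like this, supporting a new device is just writing one factory function and decorating it.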

@lantiga Hello, since PyTorch 2.3 there has been support for Intel's SYCL-based compute devices, aliased as torch.xpu, so I wonder if we can push this PR toward landing. Is there...

The CI build has been failing for a long time: https://github.com/triton-lang/triton/actions/workflows/wheels.yml

```shell
pip uninstall triton torch -y
pip install -U --pre torch --index-url https://download.pytorch.org/whl/nightly/cu126
pip uninstall triton -y
pip install -U triton-nightly...
```

> A joint petition: we hope the official team will reproduce a PyTorch version. The reproduction at https://github.com/TorchRWKV/rwkv-kit needs to load pretrained model weights; we hope for an official PyTorch version that can be trained from scratch, so that there is a real ecosystem foundation to compete with Transformers.

Hi, this repo is currently maintained by me, and I'm working with the RWKV team, so you can treat it as an official version. The whole model is...

> I looked at the direction of the paper and it's great, but the overall design is very unfriendly to anyone who wants to do further research with it. People who want to use this framework mostly hope to port it to edge devices, yet the core code is implemented in CUDA, which makes porting very troublesome and requires manual alignment; it seems every generation except v1 has done it this way? I also tested the demo, and the handling of stop tokens didn't feel great either. For such a good theoretical framework, it would be best to make the design easier for everyone to experiment with; only then does it have a chance of real adoption. Just for reference.

You can consider rwkv.cpp/llama.cpp; we also provide ONNX and pure-PyTorch code: https://github.com/TorchRWKV/flash-linear-attention/blob/main/fla/ops/rwkv6/recurrent_naive.py
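The naive recurrent formulation linked above reduces, at its core, to a simple per-timestep state update. A minimal pure-Python sketch of a linear-attention-style recurrence, with scalars standing in for per-channel vectors (illustrative only; the real `recurrent_naive.py` operates on batched tensors with per-channel decay and additional RWKV terms):

```python
# Linear-attention-style recurrence sketch: decay the running state,
# accumulate the new key/value product, read out with the query.
# `decay` plays the role of RWKV's time-decay (w) term.
def recurrent_step(state, k, v, q, decay):
    state = decay * state + k * v  # update running key/value state
    return state, q * state        # readout with query/receptance

def run_sequence(ks, vs, qs, decay=0.9):
    """Run the recurrence over a whole sequence, one token at a time."""
    state, outputs = 0.0, []
    for k, v, q in zip(ks, vs, qs):
        state, y = recurrent_step(state, k, v, q, decay)
        outputs.append(y)
    return outputs
```

Because the loop only carries a fixed-size state, this formulation needs no CUDA kernels and ports directly to edge targets, at the cost of speed.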

I would consider this issue, but since the token length keeps changing during training and inference, autotuning over token length is still worth considering.
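One common way to handle varying token lengths is to bucket them (e.g. to the next power of two) so the autotuner runs once per bucket rather than once per distinct length. A minimal sketch; `length_bucket` and `get_config` are hypothetical names, not Triton API:

```python
# Bucket sequence lengths to the next power of two so an autotuner
# caches one kernel config per bucket instead of re-tuning for every
# new length seen during training or inference.
def length_bucket(seq_len, min_bucket=16):
    bucket = min_bucket
    while bucket < seq_len:
        bucket *= 2
    return bucket

_config_cache = {}

def get_config(seq_len, tune):
    """Return a cached config for this length's bucket, tuning on first use.

    `tune` is a callable that benchmarks candidate configs for a given
    bucketed length (standing in for a real autotuning pass).
    """
    key = length_bucket(seq_len)
    if key not in _config_cache:
        _config_cache[key] = tune(key)
    return _config_cache[key]
```

This keeps tuning cost bounded (logarithmic in the maximum length) while padding each sequence by at most 2x within its bucket.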