RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, sa...

Results: 109 RWKV-LM issues, sorted by most recently updated

There are many users with AMD graphics cards who want to train this model in a GPU-accelerated manner. Radeon Open Compute (ROCm) is AMD's equivalent to CUDA (the relevant component...
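As a quick sanity check (a general PyTorch pattern, not something from this repo): ROCm builds of PyTorch reuse the same `torch.cuda` API, so most of the training code would likely be unaffected and the main porting work would be the custom wkv CUDA kernel. A minimal sketch for checking which backend an installed build targets:

```python
# Minimal sketch (an assumption, not repo code): ROCm builds of PyTorch expose
# the same torch.cuda API, so this only checks which backend is installed.
import torch

print("GPU available:", torch.cuda.is_available())
print("ROCm/HIP build:", torch.version.hip is not None)  # None on CUDA builds
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```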

Would it be possible to train this model, or run inference with it, on a Mac M1 Ultra system with 128 GB of RAM?
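For the inference side, recent PyTorch builds expose Apple-silicon GPUs through the MPS backend; a minimal device-selection sketch (a general PyTorch pattern, not repo code) could look like this:

```python
# Minimal sketch (assumption: a PyTorch build with MPS support, not repo code):
# pick the Apple-silicon GPU (MPS) when available, otherwise fall back to CPU.
import torch

device = "mps" if torch.backends.mps.is_available() else "cpu"
x = torch.randn(1, 1024, device=device)  # dummy tensor just to confirm the device works
print("Running on:", x.device)
```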

Added this table, let me know if it's useful :) ![image](https://user-images.githubusercontent.com/287189/212363310-023a9e29-3de6-4e49-8f17-30979de28d5f.png)

Hey @BlinkDL! Awesome project! I was wondering if you have performed any Seq2Seq experiments with it? Any reason for going with a GPT-style model in the first place as opposed to...

Amazing project! It would be awesome if all the information were gathered in a research paper; that would be easier to read. Thanks.

Amazing work! But I'm really confused about the implementation details of the wkv CUDA kernel (in RWKV-LM\RWKV-v4neo\cuda\wkv_cuda.cu). How does the implementation match the equations shown in the README? Could you please give...
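For reference, the numerically stable idea the kernel relies on can be written out in plain Python: keep a running maximum exponent so the `exp()` terms never overflow. The sketch below is a simplified per-channel reference of that idea, not a line-by-line translation of wkv_cuda.cu, and it assumes `w` is the already-negated per-channel decay:

```python
import numpy as np

def wkv_reference(w, u, k, v):
    """Serial, per-channel WKV recurrence (a simplified reference, not the CUDA kernel).

    w: time decay (assumed already negative), u: "time-first" bonus,
    k, v: 1-D arrays of length T for a single channel.
    """
    T = len(k)
    y = np.empty(T)
    aa, bb = 0.0, 0.0   # running numerator / denominator
    pp = -1e38          # running max exponent, for numerical stability
    for t in range(T):
        # output at step t applies the bonus u to the current token
        ww = u + k[t]
        p = max(pp, ww)
        e1, e2 = np.exp(pp - p), np.exp(ww - p)
        y[t] = (e1 * aa + e2 * v[t]) / (e1 * bb + e2)
        # update the state with decay w (current token enters without the bonus)
        ww = pp + w
        p = max(ww, k[t])
        e1, e2 = np.exp(ww - p), np.exp(k[t] - p)
        aa = e1 * aa + e2 * v[t]
        bb = e1 * bb + e2
        pp = p
    return y
```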

Hi! The machine learning (ML) community is progressing at a remarkable pace and is embracing new techniques very quickly. Based on my comprehension of this model, it appears to offer...

As I understand it, all models in the RWKV family were trained on the Pile dataset. I'm concerned about a possible lack of preprocessing of the dataset. The [Pile paper](https://arxiv.org/pdf/2101.00027.pdf) states `To avoid leakage of...

[Alpaca](https://github.com/tatsu-lab/stanford_alpaca) released their instruction-tuning dataset, which has been used for many other LLMs; any plan to finetune RWKV on similar data? Here is a source of Chinese Alpaca data: [https://github.com/LC1332/Chinese-alpaca-lora](https://github.com/LC1332/Chinese-alpaca-lora)
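If such a finetune is attempted, the Alpaca records (fields `instruction`, `input`, `output`) would first need flattening into plain-text samples. A rough sketch of that conversion, using a prompt template that is my own choice rather than an official RWKV recipe:

```python
# Sketch only: flatten Alpaca-style records into plain text for finetuning.
# The prompt template below is an assumption, not an official RWKV recipe.
import json

def alpaca_to_text(record):
    if record.get("input"):
        return (f"Instruction: {record['instruction']}\n"
                f"Input: {record['input']}\n"
                f"Response: {record['output']}\n")
    return f"Instruction: {record['instruction']}\nResponse: {record['output']}\n"

with open("alpaca_data.json", "r", encoding="utf-8") as f:
    samples = [alpaca_to_text(r) for r in json.load(f)]
print(samples[0])
```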

It would be interesting to see if the new paper from Microsoft (https://arxiv.org/pdf/2303.07295.pdf) would have the same positive impact on RWKV. I don't see why not. Is this something in...