RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, sa...

Results 109 RWKV-LM issues

LoRA additionally training parameter time_mix_r LoRA training module blocks.38.ffn.key LoRA training module blocks.38.ffn.receptance LoRA training module blocks.38.ffn.value LoRA additionally training module blocks.39.ln1 LoRA additionally training module blocks.39.ln2 LoRA additionally...

Hi, please help me train the RWKV model (rwkv4neo) with CUDA version 11.2 or 11.3. I can't install 11.7 or newer. Thanks.

Exception has occurred: IndexError index 57119 is out of bounds for dimension 0 with size 50277 File "C:\workspace\wenda\llms\llm_rwkv.py", line 345, in load_model out, state = model.forward(pipeline.encode(f'''{user}{interface} hi File "C:\workspace\wenda\wenda.py", line...
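The message above reports that the value 57119 was used to index a dimension of size 50277. A minimal sketch of a check that could run before the forward pass, assuming the index is a token ID produced by the tokenizer and 50277 is the number of rows in the model's embedding table (both are assumptions, since the traceback is truncated). `find_out_of_range_ids` is a hypothetical helper and not part of any RWKV package.

```python
def find_out_of_range_ids(token_ids, vocab_size):
    """Return (position, token_id) pairs that fall outside 0..vocab_size-1."""
    return [
        (pos, tid)
        for pos, tid in enumerate(token_ids)
        if not 0 <= tid < vocab_size
    ]


# Hypothetical token sequence; 57119 and 50277 are the values from the error message.
token_ids = [11, 57119, 42]
bad = find_out_of_range_ids(token_ids, vocab_size=50277)
if bad:
    print(f"{len(bad)} token ID(s) do not fit the table: {bad}")
```

If a check like this reports offending IDs, one thing worth verifying is whether the tokenizer and the checkpoint were built for the same vocabulary.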

Dear Author, I wanted to reach out and extend my gratitude for creating this remarkable model. It has truly opened up new horizons in my exploration of Large Language Models....

I posted this issue on Discord a week ago, but no one has replied yet, and I don't know exactly what is happening. The point is that some mixing coefficients...

The model is RWKV-4-World-0.1B-v1-20230520-ctx4096. For example: ``` User:Generate a JSON file to describe an automation action. Assistant:[ To describe an automation action, you can use the `describe` method. Here's an example of how you...

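After continued training from the world-chinese 1.5B checkpoint, the absolute loss reaches 2.74, far above the loss of 2.0 at 332B shown in the RWKV Pile chart. What could be causing this?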

Complex rotary memory. Seems to give a boost to model learning, and smaller test models appear to grok information more easily.

Hi, when following this instruction to run `RWKV-v4neo` on DDP, https://github.com/BlinkDL/RWKV-LM/blob/39a4d461a5102defd2a47f12b64b38466bf8ec4c/RWKV-v4neo/train.py#L23-L30 I got this error: ``` RuntimeError: expected scalar type BFloat16 but found Float ``` After digging a little bit...