RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable). So it combines the best of RNN and transformer - great performance, fast inference, sa...
I am testing RWKV-v7 0.4B for training, but it does not seem to work as I expected. How much memory do you use for this model, or for the other 1.5B or 3B...
Can someone put the pictures in a subdirectory? Otherwise they look a bit messy.
The first block has extra attention parameters ("blocks.0.att.v1", "blocks.0.att.v2", "blocks.0.att.v0"); the other layers are normal.
Hello, I am trying to use the RWKV4 model to process a sequential pkl dataset. However, when I use the CUDA kernel, I encounter an error: UnicodeDecodeError: 'gbk' codec can't decode...
Hello! I wonder whether RWKV7 used a sequence packing strategy during pre-training. If so, do the samples need to be masked from each other?
The state in the code is generated inside the CUDA file; why is there no need to explicitly store the state?
Using PyTorch's built-in fused operators, which internally utilize fp32 for forward computation, improves both speed and accuracy.
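A minimal sketch of the idea, using F.layer_norm purely as an illustration of a fused operator (the specific op and shapes are assumptions, not necessarily what the repo uses): the fused call replaces a chain of elementary bf16 ops, so intermediate reductions are not repeatedly rounded to bf16.

```python
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(8, 1024, dtype=torch.bfloat16, device=device)
w = torch.ones(1024, dtype=torch.bfloat16, device=device)
b = torch.zeros(1024, dtype=torch.bfloat16, device=device)

# Manual, unfused version: every intermediate tensor is materialized in bf16.
mean = x.mean(-1, keepdim=True)
var = (x - mean).pow(2).mean(-1, keepdim=True)
y_manual = (x - mean) / torch.sqrt(var + 1e-5) * w + b

# Fused version: a single kernel which (per the note above) handles the
# forward computation in fp32 internally before casting back to bf16.
y_fused = F.layer_norm(x, (1024,), w, b, eps=1e-5)

# The two results should agree closely; any gap comes from bf16 rounding
# of the intermediates in the manual version.
print((y_manual - y_fused).abs().max())
```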
How can GRPO methods be applied to further train the RWKV model?
rwkv-r1?
Are there any plans to release a reasoner model?
The __getitem__ method did not return any value when args.data_type == "uint16", causing data loaders to receive None. Added an explicit `return x, y` to match the behavior of other...
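A minimal sketch of the fix described above, with hypothetical class and field names (the repo's actual dataset class differs): the point is simply that the "uint16" branch of `__getitem__` must end with `return x, y` so the data loader never receives None.

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class UInt16Dataset(Dataset):
    """Hypothetical stand-in for the repo's dataset class (uint16 token ids)."""
    def __init__(self, data, ctx_len):
        self.data = data          # 1-D uint16 array of token ids
        self.ctx_len = ctx_len

    def __len__(self):
        return len(self.data) - self.ctx_len - 1

    def __getitem__(self, idx):
        chunk = self.data[idx : idx + self.ctx_len + 1].astype(np.int64)
        x = torch.from_numpy(chunk[:-1])   # input tokens
        y = torch.from_numpy(chunk[1:])    # next-token targets
        return x, y                        # the return that was missing in the "uint16" branch

# Quick check that the loader no longer receives None.
ds = UInt16Dataset(np.random.randint(0, 65535, size=4096, dtype=np.uint16), ctx_len=128)
x, y = ds[0]
print(x.shape, y.shape)   # torch.Size([128]) torch.Size([128])
```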