
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, sa...

109 RWKV-LM issues

Hello, I am trying out RWKV with the audio modality, and when I set T_MAX >> 1000 it throws this error: ``` Emitting ninja build file /root/.cache/torch_extensions/py39_cu116/timex/build.ninja... Building extension module timex... Allowing ninja...
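For context, RWKV-LM's custom CUDA kernel is typically compiled with a maximum sequence length baked in at build time, and very large values can exceed the GPU's shared-memory budget, which is one plausible cause of a build failure like the one above. Below is a minimal sketch of recompiling with a larger limit; the `timex` source paths and the `-DTmax` flag are assumptions, so check how `src/model.py` actually invokes the build before copying this:

```python
# Hypothetical sketch: recompiling the timex CUDA kernel with a larger T_MAX.
# Source paths and the -DTmax flag are assumptions -- verify against the
# actual build call in RWKV-LM's src/model.py.
from torch.utils.cpp_extension import load

T_MAX = 4096  # must be >= the ctx_len you train/infer with

timex_cuda = load(
    name="timex",
    sources=["cuda/timex_op.cpp", "cuda/timex_cuda.cu"],  # assumed paths
    verbose=True,
    extra_cuda_cflags=["--use_fast_math", f"-DTmax={T_MAX}"],
)
```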

Hello! I'm very interested in this work, since I'm currently using transformer-based models for classification tasks. With transformers or RNNs, classification usually takes the final element of each channel from the last block as the output and maps it to the classes through a fully connected layer. Do you think RWKV works on a similar principle? Is it still safe to take the last element as the output? I'd appreciate any advice!
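Since RWKV is an RNN, the hidden state at the final time step summarizes the whole sequence, so the usual last-element recipe carries over naturally. A minimal sketch, assuming a hypothetical `backbone` module that returns hidden states of shape `(batch, seq_len, d_model)` (RWKV-LM's actual model class may differ):

```python
# Sketch of a classification head on an RWKV-style backbone. `backbone` is a
# placeholder assumed to return hidden states of shape (B, T, d_model).
import torch
import torch.nn as nn

class RWKVClassifier(nn.Module):
    def __init__(self, backbone: nn.Module, d_model: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        hidden = self.backbone(idx)   # (B, T, d_model)
        last = hidden[:, -1, :]       # last time step, the usual RNN choice
        return self.head(last)        # (B, num_classes) logits
```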

Hi @BlinkDL! First off, this is amazing and seems very promising for scaling down large Transformers to be more production-friendly. I'm wondering if you have any benchmarks regarding VRAM...
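Absent published numbers, one rough way to benchmark VRAM yourself is to read the CUDA allocator's peak around a forward pass; note this captures PyTorch's allocations, not total process memory. Here `model` and `tokens` are placeholders for a loaded RWKV checkpoint and an input batch:

```python
# Measure peak allocated VRAM around a single forward pass.
import torch

torch.cuda.reset_peak_memory_stats()
with torch.no_grad():
    _ = model(tokens)  # placeholder model and input batch
peak_gib = torch.cuda.max_memory_allocated() / 1024**3
print(f"peak VRAM: {peak_gib:.2f} GiB")
```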

Hi, really exciting project! I'm wondering if you've published the model conversion script that you used to create the [js_models](https://github.com/BlinkDL/AI-Writer/tree/main/docs/eng/js_model) files from the `.pth` model file? It would be *awesome*...
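The conversion script itself isn't published in this thread, so the following is only a guess at the general shape: load the `.pth` state dict and dump each tensor in a form JavaScript can read. JSON is used purely for illustration (a real exporter for models this size would almost certainly write a compact binary format), and the file names are hypothetical:

```python
# Hypothetical sketch of a .pth -> JS-readable export; not the actual script.
import json
import torch

state = torch.load("model.pth", map_location="cpu")
export = {
    name: {"shape": list(t.shape), "data": t.float().flatten().tolist()}
    for name, t in state.items()
}
with open("model.json", "w") as f:
    json.dump(export, f)
```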

It is awesome and interesting. I wonder if there is any paper about RWKV? Thanks.

Is there a training plan for this project in other languages (e.g. Japanese)?

Hi there. You mention in the readme that you're interested in potentially adding some special tokens/markers to represent stuff like capitalisation. Just wanted to let you know we tried that...
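For readers unfamiliar with the idea being discussed, the usual trick looks like this: lowercase the text and insert an explicit marker token before each originally capitalised word, so the vocabulary doesn't need separate cased variants. The marker character and the initial-caps-only handling below are illustrative choices, not the repo's scheme:

```python
# Illustrative capitalisation-marker preprocessing; only handles a leading
# capital per word, and the marker character is arbitrary.
CAP = "\u2191"  # "↑", a hypothetical capitalisation marker

def mark_caps(text: str) -> str:
    out = []
    for word in text.split():
        if word[:1].isupper():
            out.append(CAP + word.lower())
        else:
            out.append(word)
    return " ".join(out)

print(mark_caps("The model is an RNN"))  # ↑the model is an ↑rnn
```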

Hi, I was training the model locally from scratch. ```shell python train.py --load_model --wandb --proj_dir out --data_file ../data/enwik8 --data_type utf-8 --vocab_size 0 --ctx_len 512 --epoch_steps 5000 --epoch_count 500 --epoch_begin 0...
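One detail worth noting in commands like the one above: with `--epoch_steps 5000` and `--ctx_len 512`, an "epoch" is a fixed number of steps rather than a full pass over enwik8, so tokens per epoch is just a product of the flags. The batch size below is an assumption, since that flag is cut off in the preview:

```python
# Back-of-envelope tokens-per-epoch from the flags shown above.
epoch_steps = 5000
ctx_len = 512
micro_bsz = 12  # assumption: the batch-size flag is not visible in the preview

tokens_per_epoch = epoch_steps * ctx_len * micro_bsz
print(f"{tokens_per_epoch:,} tokens per epoch")  # 30,720,000
```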

Hi @BlinkDL! Really interested in your work here. I am looking to test out some of the models for embedding-based tasks. What is the best way to access...
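In the meantime, a common recipe for embedding tasks with an RNN-style LM is to run the text through the backbone and pool the per-token hidden states. A hedged sketch, where `backbone` and `tokenize` are hypothetical placeholders for RWKV-LM's actual inference API:

```python
# Mean-pooled sentence embeddings from a hypothetical RWKV-style backbone.
import torch

@torch.no_grad()
def embed(text: str) -> torch.Tensor:
    idx = tokenize(text)                  # hypothetical helper -> (1, T) token ids
    hidden = backbone(idx)                # hypothetical backbone -> (1, T, d_model)
    return hidden.mean(dim=1).squeeze(0)  # (d_model,) sentence embedding
```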