RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable), so it combines the best of RNN and transformer: great performance, fast inference, sa...
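For readers new to the architecture, here is a heavily simplified sketch of the kind of time-mixing recurrence RWKV runs in RNN mode, treating one channel as a scalar. This is illustrative only (names and the scalar treatment are mine), not the repository's actual numerically stabilized kernel; note that the bare `exp(k)` here can overflow, which is exactly what the last issue below asks about.

```python
import numpy as np

def wkv_step(w, u, k, v, state):
    """One unstabilized WKV-style step: an exponentially decaying
    weighted average over past (k, v) pairs, carried in a tiny
    fixed-size state instead of a growing attention cache."""
    aa, bb = state                              # running weighted sums
    y = (aa + np.exp(u + k) * v) / (bb + np.exp(u + k))  # this token's output
    aa = np.exp(-w) * aa + np.exp(k) * v        # decay history, add new token
    bb = np.exp(-w) * bb + np.exp(k)
    return y, (aa, bb)

# Tiny demo: the state stays O(1) per channel, which is why
# inference is RNN-fast regardless of context length.
state = (0.0, 0.0)
for k, v in [(0.1, 1.0), (0.5, 2.0), (-0.2, 3.0)]:
    y, state = wkv_step(w=0.3, u=0.2, k=k, v=v, state=state)
    print(y)
```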

Results: 109 RWKV-LM issues

Hi, I was wondering whether this model can achieve [GPT-4](https://openai.com/research/gpt-4) level performance on the HumanEval benchmark, a proxy for effectiveness at code generation. I'm fine if I have to train...

Reduced unnecessary copying in the code by optimizing the slicing and appending operations. These changes should result in improved performance.
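The PR excerpt does not include the diff, but the kind of change it describes typically looks like the following hypothetical before/after (`model.sample` is a placeholder, not a function from this repository):

```python
# Copy-heavy version: the defensive slice and the list concatenation
# each allocate a new list proportional to the sequence length.
def generate_slow(model, tokens, n):
    for _ in range(n):
        t = model.sample(tokens[:])   # tokens[:] copies the whole list
        tokens = tokens + [t]         # builds yet another full copy
    return tokens

# Reduced copying: mutate a single list in place.
def generate_fast(model, tokens, n):
    for _ in range(n):
        t = model.sample(tokens)      # no defensive copy needed
        tokens.append(t)              # amortized O(1), no new list
    return tokens
```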

It would be better for the project to follow the [PEP8 Python code style](https://peps.python.org/pep-0008/), so I created a formatter configuration with [pre-commit](https://pre-commit.com/). A sample configuration:

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: ...
```
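The excerpt cuts off at the `rev` pin. For reference, a fuller configuration along the same lines might look like this; the hook choices are suggestions and the `rev` values are placeholders that would need to be pinned to current releases:

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0  # placeholder: pin to a current release
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
  - repo: https://github.com/psf/black
    rev: 23.3.0  # placeholder: pin to a current release
    hooks:
      - id: black
```

Once the file is in place, `pip install pre-commit` followed by `pre-commit install` wires the hooks into `git commit`.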

HuggingFace -> Hugging Face

Because Linux has a page cache, loading the model file under WSL2 at startup needs twice the model file's size in memory. I found a simple fix for this: right after reading, tell the operating system to release the corresponding memory.

```python
import time

def file_cleaner(file):
    last_pos = 0

    def cleaner():
        nonlocal last_pos
        print("cleaner start")
        while True:
            time.sleep(0.1)
            pos = file.tell()
            if pos > last_pos:
                print("cleaner clean %d to %d" % (last_pos, pos))
                ...
```
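The excerpt is truncated before the actual release call, but the "tell the OS" step is presumably something like `posix_fadvise` with `POSIX_FADV_DONTNEED`. A self-contained sketch of that mechanism (the file path and 1 MiB chunk size are placeholders):

```python
import os

def drop_page_cache(fd, start, end):
    # Ask the kernel to evict the already-read byte range [start, end)
    # from the page cache. POSIX_FADV_DONTNEED is advisory: the kernel
    # may ignore it, and it is only available on Unix-like systems.
    os.posix_fadvise(fd, start, end - start, os.POSIX_FADV_DONTNEED)

with open("model.pth", "rb") as f:   # "model.pth" is a placeholder path
    while f.read(1 << 20):           # stream in 1 MiB chunks
        # Simplest possible sketch: re-evict everything read so far.
        # A real cleaner would track the last evicted position instead.
        drop_page_cache(f.fileno(), 0, f.tell())
```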

During training, I have noticed that the CUDA code finishes all calculations within the ctx_len; the speed is fast, but this seems memory-unfriendly for applications with long context lengths...
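The excerpt stops before any proposal, but a common way to trade speed for memory with a recurrent formulation is to process the context in fixed-size chunks and carry the recurrent state across chunk boundaries. A hypothetical sketch (`rnn_cell` is an assumed callable, not this repository's API):

```python
import torch

def forward_chunked(rnn_cell, x, chunk_len=512):
    """Process a (T, C) sequence in fixed-size chunks, carrying the
    recurrent state forward, so peak activation memory scales with
    chunk_len rather than the full context length.
    rnn_cell: (chunk, state) -> (outputs, state)."""
    state = None
    outputs = []
    for start in range(0, x.shape[0], chunk_len):
        y, state = rnn_cell(x[start:start + chunk_len], state)
        outputs.append(y)
    return torch.cat(outputs, dim=0)
```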

I saw some code under [RWKV-LM/RWKV-v4neo/src/model.py](https://github.com/BlinkDL/RWKV-LM/blob/main/RWKV-v4neo/src/model.py) which requires CUDA to create an RWKV model. I want to change the code by replacing the first embedding layer with a linear layer to...
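The request is truncated, but the swap it describes usually amounts to the following (a hypothetical sketch: the dimensions are placeholders, and this only changes the input layer, not the custom CUDA kernel requirement):

```python
import torch
import torch.nn as nn

# nn.Embedding maps integer token ids to vectors...
emb = nn.Embedding(num_embeddings=50277, embedding_dim=768)  # placeholder sizes
tok_ids = torch.randint(0, 50277, (1, 16))
h = emb(tok_ids)                       # shape (1, 16, 768)

# ...whereas nn.Linear lets you feed continuous feature vectors instead.
proj = nn.Linear(in_features=128, out_features=768)  # 128 is a placeholder
feats = torch.randn(1, 16, 128)
h = proj(feats)                        # same (1, 16, 768) shape downstream
```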

Hi, I have one tiny question about the CUDA kernel. In the code, `aa` and `bb` are running sums. To avoid overflow, you multiplied by `exp(-p)` both when computing `y[ii]` and...
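For context, the trick being asked about is the standard log-sum-exp rescaling: keep the running sums stored at the scale of a running maximum exponent `p`, so no bare `exp()` of a large number is ever evaluated. A simplified Python rendering of the idea (the decay `w` is omitted here, and this is not the actual kernel code):

```python
import math

def wkv_stable_step(u, k, v, state):
    """aa and bb are stored scaled by exp(-pp), where pp is the running
    maximum exponent; every exp() below sees a non-positive argument."""
    aa, bb, pp = state
    # Output: combine history (at scale pp) with this token (at scale u + k).
    p = max(pp, u + k)
    e1, e2 = math.exp(pp - p), math.exp(u + k - p)
    y = (e1 * aa + e2 * v) / (e1 * bb + e2)
    # State update: fold this token (at scale k) into the scaled history.
    p = max(pp, k)
    e1, e2 = math.exp(pp - p), math.exp(k - p)
    return y, (e1 * aa + e2 * v, e1 * bb + e2, p)
```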