PENG Bo
> @BlinkDL You said earlier `ChatRWKV v2: with "stream" and "split" strategies. 3G VRAM is enough to run RWKV 14B` Yet @oobabooga said it went OOM on a 3090 (24GB...
> If it replies faster/better than a regular 13b even with the split, it's still something. Plus the faster time to train. But I guess miracles we will not get....
> @BlinkDL but it's 24GB, not 3GB. I really wanted to run that on a 3080 Ti, which only has 12GB "cuda fp16 *12 -> cpu fp32" [try increasing 12, for better...
Hi :) As I said before: [try increasing 30, for better speed, until you run out of VRAM]. @Ph0rk0z Increase the "30" in `cuda fp16 *30` to compute more layers on...
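The idea behind the strategy string can be sketched as follows: each segment names a device and dtype, an optional `*N` caps how many layers that segment takes, and the final segment absorbs the rest. `parse_strategy` below is a hypothetical illustrative helper, not part of ChatRWKV's actual code.

```python
# Hypothetical sketch of how a strategy string such as
# "cuda fp16 *30 -> cpu fp32" splits layers across devices.
# parse_strategy() is an illustrative helper, NOT the ChatRWKV API.

def parse_strategy(strategy: str, n_layer: int):
    """Assign each of n_layer layers to a (device, dtype) pair.

    Segments look like "<device> <dtype>" with an optional "*N"
    layer count, joined by "->". A segment without "*N" takes
    all remaining layers.
    """
    plan = []
    remaining = n_layer
    for segment in strategy.split("->"):
        parts = segment.split()
        device, dtype = parts[0], parts[1]
        count = remaining
        if len(parts) > 2 and parts[2].startswith("*"):
            count = min(int(parts[2].lstrip("*")), remaining)
        plan.extend([(device, dtype)] * count)
        remaining -= count
    return plan

# For a 40-layer model: first 30 layers on GPU in fp16,
# the remaining 10 on CPU in fp32.
plan = parse_strategy("cuda fp16 *30 -> cpu fp32", 40)
```

Raising the `*30` moves more layers onto the GPU, which is faster, until VRAM runs out.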
Moreover, set os.environ["RWKV_CUDA_ON"] = '1' in https://github.com/oobabooga/text-generation-webui/blob/main/modules/RWKV.py for 10x speedup of reply time
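A minimal sketch of that setting: the environment variable should be set before the model code is imported, since (to the best of my understanding) the flag is read at import time to decide whether to compile the custom CUDA kernel.

```python
import os

# Set this BEFORE importing the RWKV model code; the flag is read
# when the module loads to decide whether to build the custom CUDA
# kernel (requires a working CUDA toolchain on the machine).
os.environ["RWKV_CUDA_ON"] = "1"

# from rwkv.model import RWKV  # import only after setting the flag
```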
It's purely a PyTorch issue, because CPU utilization is fine on most Intel CPUs and AMD server CPUs. I will ask the PyTorch team. Please see whether "cuda fp16 *29+" will...
> All in all this model handles 4096 context well enough. Maybe the limits should be raised.

RWKV-ctx4096 models can handle ctx4k :) The difference between cuda and non-cuda is...
Great work :) My idea is to keep the main ChatRWKV repository simple (easy for everyone to learn its code), while having some community forks with cutting-edge functions. If you...
> Nothing offensive, but honestly I found the `global` variables are abused in the current code base. It may be a challenge for newcomers to read and understand the...
Now ChatRWKV v2 supports this too :)