text-generation-webui
Add RWKV strategy to UI and ability to save with user model data
Description
When using an RWKV model, the loading strategy must be given on the command line, and the right value often depends on the model size. The argument can also be fairly complex. Add the ability to set the strategy in the UI, so it can be changed before loading another RWKV model, and to save it with the user model data in models/config-user.yaml.
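As a rough illustration of the requested save-and-restore behaviour, here is a minimal sketch that keeps a strategy string per model in a plain dict standing in for the parsed contents of models/config-user.yaml. The key name `rwkv_strategy`, the regex-keyed layout, the helper names, and the model filename are all assumptions made for this example, not the project's actual schema or API.

```python
import re

# In-memory stand-in for the parsed contents of models/config-user.yaml.
# Assumption: entries are keyed by a regex on the model name, and the
# strategy is stored under a hypothetical "rwkv_strategy" key.


def save_strategy(user_config, model_name, strategy):
    """Store the strategy string under a pattern matching exactly this model name."""
    pattern = re.escape(model_name) + "$"
    user_config.setdefault(pattern, {})["rwkv_strategy"] = strategy
    return user_config


def load_strategy(user_config, model_name, default=None):
    """Return the saved strategy for the first matching pattern, else the default."""
    for pattern, settings in user_config.items():
        if re.search(pattern, model_name) and "rwkv_strategy" in settings:
            return settings["rwkv_strategy"]
    return default


config = {}
# "my-rwkv-7b.pth" is a made-up model filename.
save_strategy(config, "my-rwkv-7b.pth", "cuda:0 fp16 *17 -> cuda:1 fp16")
print(load_strategy(config, "my-rwkv-7b.pth"))
```

In the real UI the dict would be read from and written back to the YAML file; that step is left out here to keep the sketch dependency-free.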
Additional Context
I use two 12 GB RTX 3060 GPUs. To load a 7B RWKV model, I typically use something like: --rwkv-strategy "cuda:0 fp16 *17 -> cuda:1 fp16"
To squeeze a 14B model into those GPUs, I might use: --rwkv-strategy "cuda:0 fp16 *16 -> cuda:1 fp16 *8 -> cuda:1 fp16i8"
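Because the strategy string has internal structure, a UI field for it could run a basic sanity check before the model is loaded. The sketch below splits a strategy into stages of the form "device dtype [*N[+]]" joined by "->". This grammar is inferred from the examples above and is a simplification; the helper name `parse_strategy` is hypothetical, and the rwkv package documentation remains the authority on what is valid.

```python
import re

# One stage: a device, a dtype, and an optional "*N" layer count that may
# carry a trailing "+". Device and dtype names are not validated here.
STAGE_RE = re.compile(
    r"^(?P<device>\S+)\s+(?P<dtype>\S+)"
    r"(?:\s+\*(?P<layers>\d+)(?P<stream>\+)?)?$"
)


def parse_strategy(strategy):
    """Split a strategy string into a list of stage dicts, or raise ValueError."""
    stages = []
    for part in strategy.split("->"):
        part = part.strip()
        match = STAGE_RE.match(part)
        if match is None:
            raise ValueError(f"cannot parse stage: {part!r}")
        layers = match.group("layers")
        stages.append({
            "device": match.group("device"),
            "dtype": match.group("dtype"),
            "layers": int(layers) if layers else None,  # None: no count given
            "stream": match.group("stream") is not None,
        })
    return stages


for stage in parse_strategy("cuda:0 fp16 *16 -> cuda:1 fp16 *8 -> cuda:1 fp16i8"):
    print(stage)
```

A check like this only catches malformed stages, such as a missing dtype. It does not confirm that the layer counts fit the model or the available VRAM.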
At a minimum, I'd like to be able to save that strategy argument value in the user model config file (models/config-user.yaml). Definitive examples of the strategy syntax are here: https://pypi.org/project/rwkv/ Also, the current RWKV page in this project may contain some --rwkv-strategy examples that aren't syntactically valid and could be corrected.