Alex "mcmonkey" Goodwin
Alex "mcmonkey" Goodwin
Had the same error; redownloading from decapoda fixed it. Note that decapoda pushed an update ~15 days ago, so if you downloaded before then you have outdated weights and...
Ooo! I'm midway through making a LoRA Trainer PR for different features but I'll get to testing that ASAP. That looks to be, uh, actually possible to install and load...
Actually, the limitation here will be that it seems to have its own entirely separate model loading/formats/etc; it doesn't just build on the HuggingFace stuff or the GPTQ stuff.
Grab the PR @ https://github.com/oobabooga/text-generation-webui/pull/1098 and run `pip install git+https://github.com/huggingface/peft` to get an updated peft (both of these will be landing on the main branch soon; waiting on ooba to merge the...
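For anyone unsure how to test a PR branch locally, here's roughly what that looks like (the local branch name `lora-trainer-pr` is just a placeholder I picked):

```sh
# fetch the PR's head into a local branch and switch to it
git fetch origin pull/1098/head:lora-trainer-pr
git checkout lora-trainer-pr

# install the latest development peft straight from GitHub
pip install git+https://github.com/huggingface/peft
```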
Despite the coincidence in naming, `monkey-patch` does not actually relate to me lol, other than that I'm one of the people excited to use it because I like 4-bit...
At a glance, the perplexity code looks to be about right in terms of logic (haven't verified the small details). One thing you might add is to use the streaming...
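To illustrate the general shape of it (this is only a rough sketch using HF `datasets`/`transformers`, not the PR's actual code; the comment above is cut off, so the streaming-dataset part is my assumption, and `gpt2` is just a stand-in model):

```python
# Rough sketch: stream the eval set instead of materializing it all up front,
# accumulating token-weighted negative log-likelihood for perplexity.
import math
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)
model.eval()

# streaming=True yields rows lazily rather than downloading/caching everything
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="test", streaming=True)

nll_sum, token_count = 0.0, 0
for row in dataset:
    text = row["text"]
    if not text.strip():
        continue
    ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
    if ids.shape[1] < 2:
        continue  # need at least one predicted token
    with torch.no_grad():
        # labels=ids makes the model return mean cross-entropy over n-1 tokens
        loss = model(ids, labels=ids).loss
    n_predicted = ids.shape[1] - 1
    nll_sum += loss.item() * n_predicted
    token_count += n_predicted

print("perplexity:", math.exp(nll_sum / token_count))
```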
I'm running as non-root on an Ubuntu install without issue. I have seen this error before, but it's usually resolved by closing my terminal window and opening a new one,...
Now the question is whether LLaVA or MiniGPT4 #1312 is better
Actual numbers, since I know some people will want them. Tested on an RTX 3090 on Linux, with ~1.3 GiB of VRAM in use while nothing is loaded. VRAM (inference): 800 tokens input, generating...
@janvarev `--auto-devices` / `--gpu-memory` don't work with 4-bit; you have to use `--pre_layer` ... and seemingly neither works _at all_ with `--monkey-patch`. Like it doesn't even process the pre_layer at...
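For reference, this is roughly the invocation I'm describing (model name and layer count are just examples; `--pre_layer` sets how many layers go on the GPU in GPTQ mode, with the rest offloaded to CPU):

```sh
# 4-bit GPTQ with CPU offloading: first 20 layers on GPU, the rest on CPU
python server.py --model llama-7b-4bit --wbits 4 --groupsize 128 --pre_layer 20

# with --monkey-patch added, the --pre_layer setting seemingly gets ignored entirely
python server.py --model llama-7b-4bit --wbits 4 --groupsize 128 --pre_layer 20 --monkey-patch
```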