bartman081523
> This is the output I get from conda list -p "C:\Users\patri\miniconda3\envs\textgen":

Your env looks alright, as far as I can tell.

> Wait - there is something wrong here....
I found this maybe relevant: https://github.com/huggingface/peft/blob/main/examples/causal_language_modeling/peft_lora_clm_accelerate_big_model_inference.ipynb

```
from peft import PeftModel, PeftConfig

max_memory = {0: "1GIB", 1: "1GIB", 2: "2GIB", 3: "10GIB", "cpu": "30GB"}
peft_model_id = "smangrul/twitter_complaints_bigscience_bloomz-7b1_LORA_CAUSAL_LM"
config = PeftConfig.from_pretrained(peft_model_id)...
```
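For reference, that notebook cell continues roughly like this (rewritten from memory of the linked example, so treat the exact arguments as an approximation): the adapter config points back to the base model, and `max_memory` lets accelerate spread the layers across the GPUs and CPU.

```python
# Sketch based on the linked peft example notebook - not copied verbatim.
from transformers import AutoModelForCausalLM
from peft import PeftModel, PeftConfig

max_memory = {0: "1GIB", 1: "1GIB", 2: "2GIB", 3: "10GIB", "cpu": "30GB"}
peft_model_id = "smangrul/twitter_complaints_bigscience_bloomz-7b1_LORA_CAUSAL_LM"

# The adapter config stores which base model it was trained on.
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model sharded across devices according to max_memory,
# then attach the LoRA adapter weights on top of it.
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path, device_map="auto", max_memory=max_memory
)
model = PeftModel.from_pretrained(
    model, peft_model_id, device_map="auto", max_memory=max_memory
)
model.eval()
```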
I tried without "--gptq-bits 4", and that failed with another error:

```
python server.py --model llama-7b --lora alpaca --listen --gpu-memory 11 --cpu-memory 16 --disk

===================================BUG REPORT===================================
Welcome to bitsandbytes. For...
```
> Did you manage to find a solution?

Yes (but no). I tried to load in 8-bit mode:

`python server.py --model llama-7b --lora alpaca --load-in-8bit`

In my opinion, this is...
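Under the hood, `--load-in-8bit` plus `--lora` amounts to something like the following (a sketch, not the webui's actual code; the `models/llama-7b` and `loras/alpaca` paths are just placeholders for wherever the weights live):

```python
# Rough sketch of 8-bit base model + LoRA adapter loading; paths are placeholders.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base_path = "models/llama-7b"  # placeholder path to the converted base weights
lora_path = "loras/alpaca"     # placeholder path to the LoRA adapter

# bitsandbytes quantizes the linear layers to int8 while loading.
model = AutoModelForCausalLM.from_pretrained(
    base_path, load_in_8bit=True, device_map="auto"
)

# The LoRA weights themselves stay in floating point and are applied
# on top of the quantized layers.
model = PeftModel.from_pretrained(model, lora_path)
model.eval()
```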
> Did you manage to find a solution?

I found a way to load a chat-finetuned model; although it is not Alpaca, it is still very good.

```
cd...
```
@wywywywy @BadisG found a way to fix 4-bit mode: https://github.com/oobabooga/text-generation-webui/issues/332#issuecomment-1474883977

Change the `lora.py` from the `peft` package:

- Windows: `C:\Users\Utilisateur\anaconda3\envs\textgen\lib\site-packages\peft\tuners\lora.py`
- Linux: `venv/lib/python3.10/site-packages/peft/tuners/lora.py`

Fixed `lora.py`: https://pastebin.com/eUWZsirk

@BadisG added those 2 instructions on...
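To be clear, the actual patch is the pastebin above. If I recall the issue correctly, the problem is a dtype mismatch: the quantized base layer runs in reduced precision while peft keeps `lora_A`/`lora_B` in float32, so the LoRA matmul needs the activations cast over and the result cast back. The snippet below is only a self-contained illustration of that kind of fix, with a made-up `TinyLoraLinear` class, not the patched `lora.py` itself.

```python
# Illustration only - the real patch is the pastebin linked above.
import torch
import torch.nn as nn

LOW = torch.bfloat16  # stand-in for the reduced precision of the quantized base layer


class TinyLoraLinear(nn.Module):
    """Toy layer with a low-precision base weight and float32 LoRA weights."""

    def __init__(self, in_features=8, out_features=8, r=2):
        super().__init__()
        self.base = nn.Linear(in_features, out_features).to(LOW)  # "quantized" base layer
        self.lora_A = nn.Linear(in_features, r, bias=False)       # float32, as peft keeps them
        self.lora_B = nn.Linear(r, out_features, bias=False)
        self.scaling = 1.0

    def forward(self, x):
        result = self.base(x.to(LOW))
        previous_dtype = x.dtype
        x = x.to(self.lora_A.weight.dtype)          # cast activations to the LoRA dtype
        lora_out = self.lora_B(self.lora_A(x)) * self.scaling
        return (result + lora_out.to(result.dtype)).to(previous_dtype)


print(TinyLoraLinear()(torch.randn(1, 8)).dtype)  # torch.float32
```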
> Good fix thank you. It worked. And i thank @BadisG
>
> But I wonder why not everybody faces the same problem? Other people can GPTQ 4bit without modifying...
> maybe this information needs to be in a pull request, as it's difficult to find.

I agree, and the patch is at this time for the peft module, not...
> are you splitting the model in a multi-gpu setup?

no.
With `git reset --hard` and `git pull` (update) and the peft fix below, it is now possible to load LoRA models in 4-bit or 8-bit with `--gptq-bits 4` or `--load-in-8bit`...