generic-username0718
hell, just load the LoRA on your phone and refresh the page... it bugs out...
> > Did you manage to find a solution?
>
> Yes (but no). I tried to load in 8-bit mode: `python server.py --model llama-7b --lora alpaca --load-in-8bit`
>
> ...
I think I'm running into this bug: https://github.com/huggingface/peft/issues/115#issuecomment-1460706852. It looks like I may need to modify `PeftModel.from_pretrained` or `PeftModelForCausalLM`, but I'm not sure where...
> For me/us, this fixed 8bit and 4bit with LoRA mode: [#332 (comment)](https://github.com/oobabooga/text-generation-webui/issues/332#issuecomment-1474883977)

Are you splitting the model in a multi-GPU setup?
Yeah, I think that's my problem... Looks like this guy may have done it... something about autocast? https://github.com/huggingface/peft/issues/115#issuecomment-1441016348

```
with torch.cuda.amp.autocast():
    outputs = model.generate(input_ids=inputs['input_ids'], max_new_tokens=10)
```
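For what it's worth, here's a rough, self-contained sketch of that workaround as I understand it: load the base model in 8-bit split across the GPUs with `device_map="auto"`, attach the LoRA with `PeftModel.from_pretrained`, and wrap `generate()` in autocast. The model name and adapter path below are just placeholders, not necessarily what anyone here is running.

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "decapoda-research/llama-7b-hf"   # placeholder base model
adapter = "tloen/alpaca-lora-7b"         # placeholder LoRA adapter

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    load_in_8bit=True,   # bitsandbytes int8 weights
    device_map="auto",   # accelerate splits the layers across the available GPUs
)
model = PeftModel.from_pretrained(model, adapter)

inputs = tokenizer("Tell me about alpacas.", return_tensors="pt").to(model.device)

# Running generate() under autocast casts the mixed fp16/fp32 ops to a common
# dtype, which is what the linked comment suggests avoids the multi-GPU LoRA error.
with torch.cuda.amp.autocast():
    outputs = model.generate(input_ids=inputs["input_ids"], max_new_tokens=10)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```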
Is there something I need to do to support LoRA in a multi-GPU configuration?
Awesome stuff. I'm able to load LLaMA-7b, but trying to load LLaMA-13b crashes with this error:

```
Traceback (most recent call last):
  File "/home/user/Documents/oobabooga/text-generation-webui/server.py", line 189, in <module>
    shared.model, shared.tokenizer = ...
```
Anyone reading this: you can get past the issue above by changing the `world_size` variable found in modules/LLaMA.py like this:

```
def setup_model_parallel() -> Tuple[int, int]:
    local_rank = int(os.environ.get("LOCAL_RANK", -1))
    world_size...
```
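In case it helps, a minimal sketch of what that change might look like, assuming the idea is to hard-code `world_size` to the number of GPUs / checkpoint shards (2 for LLaMA-13b) instead of reading it from the environment, with the rest of the function following the reference LLaMA setup code; the exact value and surrounding lines are assumptions, not the author's diff.

```
import os
from typing import Tuple

import torch
from fairscale.nn.model_parallel.initialize import initialize_model_parallel


def setup_model_parallel() -> Tuple[int, int]:
    local_rank = int(os.environ.get("LOCAL_RANK", -1))
    # Hard-coded instead of int(os.environ.get("WORLD_SIZE", -1)), which
    # gives -1 when the env var isn't set. 2 matches two GPUs and the two
    # LLaMA-13b checkpoint shards; adjust for your setup.
    world_size = 2

    torch.distributed.init_process_group("nccl")
    initialize_model_parallel(world_size)
    torch.cuda.set_device(local_rank)

    # Seed must be the same in all processes.
    torch.manual_seed(1)
    return local_rank, world_size
```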
Is there a parameter I need to pass to oobabooga to tell it to split the model across my two 3090 GPUs?