Atharva Vaidya
It's happening after commit fe17340. I believe that with optimized mode the model isn't getting transferred to the GPU at all now, hence the multiple-devices error.
@oobabooga, are you using the optimised switch?
RTX 2060 Mobile 6GB
> If you do not use the option `--no-stream`, then: `CUDA error: an illegal memory access was encountered` I have the exact...
> I can run it with CPU, but still get an error with GPU: `python server.py --listen --model llama-7b --load-in-8bit --lora alpaca-lora-7b` Hi, did you find any solution for this? I'm...
> Fwiw, there's already a working implementation in the v21 branch of my Dreambooth extension. It should get merged into main today. I just tried it, but I can't seem...
> Also, disabling `--opt-channelslast` reduced the frequency of OOM for me, in addition to the above. I think switching from xformers to flash_attention might also save a bit more RAM...
Thanks a lot for this! A note for anyone who gets `NameError: name 'BitsAndBytesConfig' is not defined`: use the second method, i.e. add `params.extend(["load_in_8bit=True", "llm_int8_enable_fp32_cpu_offload=True"])` below the pre-existing code (instead...
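For clarity, here is a minimal sketch of that second workaround. The `params` name follows the comment above, but the pre-existing entries in the list are an assumption, not the actual webui source:

```python
# Hypothetical sketch: rather than constructing a BitsAndBytesConfig
# (which raises NameError on transformers versions that don't export it),
# append the options as plain strings to the keyword-argument list that
# later gets passed along to from_pretrained().
params = ["low_cpu_mem_usage=True"]  # placeholder for the pre-existing entries

# The two options from the workaround, appended below the existing code:
params.extend(["load_in_8bit=True", "llm_int8_enable_fp32_cpu_offload=True"])

print(params)
```

The idea is that both code paths end up feeding the same keyword arguments to the model loader; this one just avoids referencing the `BitsAndBytesConfig` class directly.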
> If you're willing to manually retype the conversation history, then you can get your question answered, like so: Thanks! I guess that'll do for now. Hoping that it is...