generic-username0718
Sorry, super dumb question, but do I pass this to start-webui.sh? Like `sh start-webui.sh --gpu-memory 10 5`?
Thanks friend! I was able to get it with `call python server.py --gpu-memory 20 20 --cai-chat`
> `--gpu-memory` should have no effect on LLaMA. This is for models loaded using the `from_pretrained` function from HF.
>
> For LLaMA, the correct way is to change the...
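For context, here is a minimal sketch of how a per-GPU cap like `--gpu-memory 20 20` is typically expressed for HF's `from_pretrained` (the code path the quote is describing). This is not the webui's actual code; the model id, CPU budget, and variable names are placeholders.

```python
# Sketch: translating per-GPU caps into the max_memory dict that HF's
# from_pretrained / accelerate use when splitting a model across devices.
from transformers import AutoModelForCausalLM

gpu_memory = ["20", "20"]  # hypothetical values parsed from "--gpu-memory 20 20"
max_memory = {i: f"{m}GiB" for i, m in enumerate(gpu_memory)}
max_memory["cpu"] = "64GiB"  # assumed CPU offload budget

model = AutoModelForCausalLM.from_pretrained(
    "some/base-model",      # placeholder model id
    device_map="auto",      # let accelerate decide the split across devices
    max_memory=max_memory,  # cap how much each GPU is allowed to hold
)
```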
I think I'm running into this bug: https://github.com/huggingface/peft/issues/115#issuecomment-1460706852

Looks like I may need to modify `PeftModel.from_pretrained` or `PeftModelForCausalLM`, but I'm not sure where...
I think something is broken for int8 split-model LoRA right now... but I'm not sure where to fix it. I think this commenter got it working: https://github.com/huggingface/peft/issues/115#issuecomment-1441016348
I found a really hacky fix... I kept running OOM because the model loads lopsided across the GPUs... so I made the following changes to the modules/LoRA.py file: 1) replace `params['device_map'] =...
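A sketch of the kind of custom `device_map` change this is pointing at, under assumptions: the model id, LoRA path, and memory budgets are placeholders, and this is one known workaround for the lopsided load, not necessarily the exact edit made to modules/LoRA.py.

```python
# Sketch: reuse the base model's existing device placement when attaching the
# LoRA, instead of letting the adapter load trigger a fresh (lopsided) dispatch.
from transformers import AutoModelForCausalLM
from peft import PeftModel

max_memory = {0: "10GiB", 1: "10GiB", "cpu": "30GiB"}  # assumed per-device budgets

base_model = AutoModelForCausalLM.from_pretrained(
    "some/base-model",       # placeholder
    device_map="auto",
    max_memory=max_memory,
    load_in_8bit=True,       # int8, as in the split-model LoRA case above
)

# accelerate records the split it chose in hf_device_map; remap those keys onto
# the module names the PeftModel wrapper will use ("base_model.model." prefix).
custom_map = {
    "base_model.model." + k: v for k, v in base_model.hf_device_map.items()
}

model = PeftModel.from_pretrained(
    base_model,
    "some/lora-adapter",     # placeholder LoRA path or hub id
    device_map=custom_map,   # keep the adapter on the same balanced split
)
```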