FastChat
Add LoraAdapter to model_adapter.py
Add a LoRA adapter so that a finetuned adapter no longer needs to be merged into the base model every time, but can instead be loaded directly.
Why are these changes needed?
Loading adapters directly allows many different finetuned versions to be served with low storage requirements and simpler infrastructure management.
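For illustration, here is a minimal sketch of what direct adapter loading can look like using the Hugging Face PEFT library; the model and adapter paths are hypothetical placeholders, and this is not necessarily the exact implementation added to model_adapter.py:

```python
# Minimal sketch of direct LoRA adapter loading via Hugging Face PEFT.
# Paths below are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "lmsys/vicuna-7b-v1.1"
base_model = AutoModelForCausalLM.from_pretrained(base_path, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(base_path)

# Attach the finetuned LoRA weights without merging them into the base model;
# only the small adapter tensors need to be stored and loaded per version.
model = PeftModel.from_pretrained(base_model, "path/to/lora-adapter")
```

Because the adapter checkpoint contains only the low-rank delta weights, each finetuned version on disk is a few hundred megabytes instead of a full copy of the base model.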
Checks
- I tested it with Vicuna-7B v1.1 and an adapter finetuned using alpaca-lora.
- Possible improvement: allow passing arguments such as `--load-8bit` through to the base model.
- Possible improvement: support many adapters sharing a single base model on the same GPU for efficient VRAM usage, making it easy to compare many finetuned versions (see the sketch after this list).
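As a sketch of the multi-adapter idea: PEFT's multi-adapter API is one possible way to realize it, not necessarily what this PR implements, and the adapter names and paths below are hypothetical placeholders.

```python
# Minimal sketch: several LoRA adapters sharing one base model on the same GPU.
# Adapter names and paths are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained(
    "lmsys/vicuna-7b-v1.1", torch_dtype=torch.float16
)

# Register two finetuned versions on top of the same base weights.
model = PeftModel.from_pretrained(base_model, "path/to/adapter-a", adapter_name="a")
model.load_adapter("path/to/adapter-b", adapter_name="b")

# Switch adapters without reloading the multi-gigabyte base model,
# so comparing versions costs only the small adapter weights in VRAM.
model.set_adapter("a")
# ... generate with version "a" ...
model.set_adapter("b")
# ... generate with version "b" ...
```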
closed by https://github.com/lm-sys/FastChat/pull/1807