
Add LoraAdapter to model_adapter.py

Open · WisartArfun opened this issue 1 year ago • 1 comment

Add a LoRA adapter, so that a finetuned adapter doesn't need to be merged into the base model every time, but can instead be loaded directly.

Why are these changes needed?

Loading adapters directly allows many different finetuned versions to be served with little additional storage and simpler infrastructure management.
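To make the trade-off concrete, here is a minimal NumPy sketch of the LoRA idea (not FastChat's implementation; all names and shapes are illustrative). A frozen base weight `W` gets a low-rank update `B @ A`; merging bakes that update into `W`, while direct loading keeps `A` and `B` separate and applies them at forward time, which is what this issue asks for:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                      # hidden size and LoRA rank (illustrative)
alpha = 4.0                      # LoRA scaling hyperparameter
W = rng.standard_normal((d, d))  # frozen base weight
A = rng.standard_normal((r, d)) * 0.01  # adapter down-projection
B = rng.standard_normal((d, r)) * 0.01  # adapter up-projection
x = rng.standard_normal(d)

# Option 1: merge the adapter into the base weight (what the issue avoids:
# every finetuned version would need its own full copy of W_merged).
W_merged = W + (alpha / r) * (B @ A)
y_merged = W_merged @ x

# Option 2: keep the adapter separate and apply it at forward time.
y_direct = W @ x + (alpha / r) * (B @ (A @ x))

# Both paths produce the same output, up to floating-point error.
assert np.allclose(y_merged, y_direct)
```

Storing only `A` and `B` costs `2 * r * d` numbers per layer instead of `d * d`, which is where the storage savings come from.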

Checks

  • I tested it with Vicuna-7B-v1.1 and an adapter finetuned using alpaca-lora

WisartArfun avatar May 25 '23 12:05 WisartArfun

  • Improve by allowing arguments such as --load-8bit to be passed to the base model.
  • Add support for many adapters sharing a single base model on the same GPU for efficient VRAM usage => easily compare many finetuned versions
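The second point above can be sketched as follows (a hypothetical NumPy illustration, not FastChat's code; the adapter names and shapes are made up). A single copy of the base weight stays in GPU memory while each request selects one small `(A, B)` pair:

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, alpha = 8, 2, 4.0
W = rng.standard_normal((d, d))          # single shared base weight

# Hypothetical registry of finetuned adapters; each is just an (A, B) pair.
adapters = {
    name: (rng.standard_normal((r, d)) * 0.01,
           rng.standard_normal((d, r)) * 0.01)
    for name in ("variant-a", "variant-b")
}

def forward(x, adapter=None):
    """Apply the shared base weight, plus one adapter's low-rank delta."""
    y = W @ x
    if adapter is not None:
        A, B = adapters[adapter]
        y = y + (alpha / r) * (B @ (A @ x))
    return y

x = rng.standard_normal(d)
outs = {name: forward(x, name) for name in adapters}
# One copy of W serves every finetuned variant, so comparing them side by
# side costs only the adapters' memory, not a full model per variant.
```

Switching adapters is just a dictionary lookup here; in a real server the same idea lets many finetuned versions share one base model's VRAM.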

WisartArfun avatar May 25 '23 20:05 WisartArfun

Closed by https://github.com/lm-sys/FastChat/pull/1807

merrymercy avatar Jun 29 '23 06:06 merrymercy