
The output only has adapter_config.json and adapter_model.bin

Open zxyscz opened this issue 1 year ago • 8 comments

I used LoRA to fine-tune a model, but the final output does not have a checkpoint, only adapter_config.json and adapter_model.bin.

zxyscz avatar Apr 14 '23 03:04 zxyscz

That is right, there are only two files.

shibing624 avatar Apr 14 '23 05:04 shibing624

Hello, that is the expected output, as the final checkpoints with PEFT methods are tiny. You can load them via PeftModel.from_pretrained(base_model, peft_model_name_or_path).
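
For instance, a minimal loading sketch (the model name and adapter path below are placeholders):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the full base model first (placeholder name).
base_model = AutoModelForCausalLM.from_pretrained("your-base-model")

# Then attach the saved adapter; only the tiny adapter_config.json and
# adapter_model.bin are read from this directory.
model = PeftModel.from_pretrained(base_model, "path/to/lora_output")
```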

pacman100 avatar Apr 14 '23 06:04 pacman100

Hi @pacman100 , could you explain why the code is structured such that you must provide the base_model? It seems to me that the base_model is already present in the adapter_config.json and thus we should be able to call PeftModel.from_pretrained(peft_model_name_or_path) and the base_model should be loaded internally. Ideally, we can even call AutoModel.from_pretrained(peft_model_name_or_path) and the user is oblivious as to whether the underlying weights are coming from a standard or PEFT model.

vincentmin avatar Apr 16 '23 17:04 vincentmin

Is there a way to save the base model and the adapter merged into one checkpoint?

I think I found my answer:

https://github.com/tloen/alpaca-lora/blob/main/export_state_dict_checkpoint.py

clxyder avatar Apr 20 '23 01:04 clxyder

@clxyder, with the latest main branch, you can simply do model = model.merge_and_unload() to get the base model with lora weights merged into it.
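
A short sketch of the full flow (paths are placeholders):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("your-base-model")
model = PeftModel.from_pretrained(base_model, "path/to/lora_output")

# Fold the LoRA weights into the base weights and strip the adapter wrappers.
merged = model.merge_and_unload()

# The result is a plain transformers model, so it can be saved
# as a single self-contained checkpoint.
merged.save_pretrained("path/to/merged_checkpoint")
```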

pacman100 avatar Apr 20 '23 09:04 pacman100

> @clxyder, with the latest main branch, you can simply do model = model.merge_and_unload() to get the base model with lora weights merged into it.

Thank you for the answer! Is there a way I can update or set the state_dict without reloading it again?

clxyder avatar Apr 21 '23 03:04 clxyder

What would it take to support GPT2 models in merge_and_unload?

Getting the error: "GPT2 models are not supported for merging LORA layers"

gschoeni avatar Apr 22 '23 16:04 gschoeni

> @clxyder, with the latest main branch, you can simply do model = model.merge_and_unload() to get the base model with lora weights merged into it.

@pacman100 @younesbelkada Hi, thanks for this cool library! I'm new to PEFT. The model.merge_and_unload() method looks like magic to me. Could you give a quick introduction to model.merge_and_unload()?

In my view, LoRA adds new trainable parameters/layers and inserts them into the base model; that is, the LoRA model has additional structures on top of the base model. We can then save the merge_and_unload model and reload it with the base_model.from_pretrained(unloaded_model_path) interface. But where do the additional layers and parameters go?

Opdoop avatar Apr 26 '23 03:04 Opdoop

Hi @Opdoop ,

Thanks for raising this! In a nutshell, this is a diagram that explains how the merging works under the hood:

[Diagram: the input x passes through the frozen base weight W and, in parallel, through the trainable LoRA matrices A and B; the two outputs are summed]

So during training, you have these two independent modules (A & B) that are trainable. In that scenario, the output hidden states h can be computed as:

h = (Wx + BAx) + b

During training, as you only want to update A & B, you can't simplify the mathematical expression and run the computation as a single matrix multiplication.

However, once A & B have been trained, you can "merge" these weights by simply adding them to W as follows:

W_merged = (W + BA) 

since

W_merged x + b = (Wx + BAx) + b

The merged model is totally equivalent to the un-merged model, but this time you only need a single weight, W_merged.
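
You can check this equivalence numerically with a toy example (sizes are arbitrary):

```python
import torch

d, r = 16, 4           # hidden size and LoRA rank (toy values)
W = torch.randn(d, d)  # frozen base weight
A = torch.randn(r, d)  # LoRA down-projection
B = torch.randn(d, r)  # LoRA up-projection
b = torch.randn(d)     # bias
x = torch.randn(d)     # input

# Un-merged forward pass: base path plus LoRA path.
h_unmerged = (W @ x + B @ (A @ x)) + b

# Merged forward pass: a single matrix multiplication.
W_merged = W + B @ A
h_merged = W_merged @ x + b

assert torch.allclose(h_unmerged, h_merged, atol=1e-5)
```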

Let us know if anything else is unclear.

younesbelkada avatar Apr 27 '23 11:04 younesbelkada

@younesbelkada Cool! A big thanks to you! This explanation cleared up my confusion in a very simple and concrete way. Thanks for the beautiful diagram and math explanation! 🌹🌹🌹

Opdoop avatar Apr 27 '23 12:04 Opdoop

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions[bot] avatar May 21 '23 15:05 github-actions[bot]