
How to apply LoRA adapter to a model loaded with OVModelForCausalLM()?

Open nai-kon opened this issue 10 months ago • 4 comments

In the transformers library, we can load multiple adapters onto a base model with load_adapter and then switch between them with set_adapter, like below.

from transformers import AutoModelForCausalLM

# base model
model = AutoModelForCausalLM.from_pretrained(model_name)

# load multiple adapters
model.load_adapter("model/adapter1/", "adapter1")
model.load_adapter("model/adapter2/", "adapter2")

# switch adapter
model.set_adapter("adapter2")

Now I want to apply LoRA adapters with OpenVINO, but I can't find an example of it. Is it possible to do it with OVModelForCausalLM?

nai-kon avatar Mar 29 '24 01:03 nai-kon

You probably can't do that once the model is loaded/exported to OpenVINO (or any framework with a static computation graph). What you can do instead is fuse one or more adapters into the base model and export the result to OpenVINO:

  • Load the model using AutoModelForCausalLM
  • Load your target adapters the way you usually would
  • Create PeftModel and use its merge_and_unload method to fuse the adapters into the base model
  • Save the fused model locally or push to the Hub.
  • Load it with OVModelForCausalLM
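
The steps above could be sketched roughly as follows (a minimal, untested sketch: the function name, the adapter directory, and the output directory are placeholders, and it fuses a single adapter; it assumes transformers, peft, and optimum-intel are installed):

```python
def merge_lora_and_export(model_name, adapter_dir, output_dir):
    from transformers import AutoModelForCausalLM
    from peft import PeftModel
    from optimum.intel import OVModelForCausalLM

    # 1. Load the base model with transformers
    base = AutoModelForCausalLM.from_pretrained(model_name)

    # 2.-3. Attach the adapter, then fuse its weights into the base model
    peft_model = PeftModel.from_pretrained(base, adapter_dir)
    merged = peft_model.merge_and_unload()

    # 4. Save the fused model locally (push_to_hub would also work)
    merged.save_pretrained(output_dir)

    # 5. Re-load the fused model and export it to OpenVINO IR
    return OVModelForCausalLM.from_pretrained(output_dir, export=True)
```

Note that after merge_and_unload the adapter weights are baked into the base weights, so the exported OpenVINO model has no notion of a separate adapter anymore.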

IlyasMoutawwakil avatar Apr 10 '24 08:04 IlyasMoutawwakil

Thank you for your detailed answer. I understood that dynamic switching is difficult with static graph frameworks such as OpenVINO.

nai-kon avatar Apr 14 '24 23:04 nai-kon

An additional question: I understand that multiple adapters can be merged into the model using merge_and_unload(), but is it possible to load a model that contains multiple adapters with OVModelForCausalLM and switch between them? Or do I need to merge each adapter into its own model, so that three adapters would require three merged models? If so, my concern is that the total file size grows in proportion to the number of adapters.

nai-kon avatar Apr 15 '24 00:04 nai-kon

For now it seems impossible to me, but my understanding of the OpenVINO runtime is still quite limited. This issue https://github.com/openvinotoolkit/openvino/issues/21806 requests an API for exactly this (keeping the same model while changing weights between inference requests). Apparently, in theory, it should be possible using the State API; I will look into it.

IlyasMoutawwakil avatar Apr 25 '24 08:04 IlyasMoutawwakil

Thank you for the information!

nai-kon avatar Aug 03 '24 12:08 nai-kon