Deploying Falcon to SageMaker TGI DLC after QLoRA fine-tuning

Open austinmw opened this issue 1 year ago • 2 comments

Feature request

Hi,

I was able to deploy the base Falcon-40B model to SageMaker using the TGI DLC by following this blog post.

I also recently fine-tuned the Falcon-40B model with QLoRA on SageMaker, and obtained the following files:

model/checkpoint-1000/adapter_model/adapter_model.bin
model/checkpoint-1000/adapter_model/adapter_config.json

Now I'm wondering: is it currently possible to deploy the base model together with these adapter weights to the TGI DLC on SageMaker? If not, is it possible to deploy it to SageMaker without TGI?

Motivation

Feature request if this doesn't already exist.

Your contribution

🤷🏻‍♂️

austinmw avatar Jun 14 '23 17:06 austinmw

Any update on this?

Mohamedhabi avatar Jun 19 '23 11:06 Mohamedhabi

Hey, I don't know for sure. The most obvious way would be to "write" the LoRA weights directly into your base model (i.e. merge them), creating an entirely new, LoRA-free model.

Not sure if/how it works with QLoRA.

Narsil avatar Jun 19 '23 11:06 Narsil
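
A minimal sketch of the merge suggested above, assuming the adapter was trained with the `peft` library and that `transformers`, `peft` and `torch` are installed. The function name `merge_lora_into_base`, the base model ID and the output directory are hypothetical choices for illustration. Loading the base model in bfloat16 rather than 4-bit is also an assumption: it sidesteps the open question of merging into quantized weights, at the cost of needing enough memory to hold the full model.

```python
def merge_lora_into_base(base_model_id, adapter_dir, output_dir):
    """Fold LoRA adapter weights into the base model and save a plain checkpoint."""
    # Imports are local so that defining this function needs no heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model unquantized (assumption: bf16 fits in host/GPU memory).
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
    )

    # Attach the adapter (expects adapter_config.json and the adapter weights
    # in adapter_dir), then fold it into the base weights.
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()

    # The saved directory no longer depends on peft at load time.
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)
    return output_dir


# Example call (not executed here; the model ID and output path are assumptions):
# merge_lora_into_base(
#     "tiiuae/falcon-40b",
#     "model/checkpoint-1000/adapter_model",
#     "merged-falcon-40b",
# )
```

The resulting directory is an ordinary model checkpoint, so in principle it can be served the same way as the base model; whether this round-trips cleanly for a QLoRA-trained adapter is not confirmed in this thread.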

Has anyone tried running TGI on custom fine-tuned models? I have merged the LoRA weights back into the base model and am trying to figure out what the next steps should be.

sarthak221995 avatar Aug 02 '23 08:08 sarthak221995

We're going to do that automatically for you soon: https://github.com/huggingface/text-generation-inference/pull/762

In the meantime: https://github.com/huggingface/text-generation-inference/issues/482#issuecomment-1602174068

Closing this in favor of #482

Narsil avatar Aug 03 '23 08:08 Narsil