text-generation-inference
Deploying Falcon to SageMaker TGI DLC after QLoRA fine-tuning
Feature request
Hi,
I was able to deploy the base Falcon-40B model to SageMaker using the TGI DLC by following this blog post.
I also recently fine-tuned the Falcon-40B model with QLoRA on SageMaker, and obtained the following files:
model/checkpoint-1000/adapter_model/adapter_model.bin
model/checkpoint-1000/adapter_model/adapter_config.json
Now I'm wondering: is it currently possible to deploy the base model together with these adapter weights to the TGI DLC on SageMaker? If not, is it possible to deploy it to SageMaker without TGI?
Motivation
Feature request if this doesn't already exist.
Your contribution
🤷🏻♂️
Any update on this?
Hey, I don't know for sure. The most obvious way would be to "write" the LoRA weights directly into your model (i.e. merge the adapter into the base weights), creating an entirely new, LoRA-free model.
Not sure if/how that works with QLoRA.
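For reference, here is a minimal sketch of that merge step using `peft` and `transformers`. It is not an official TGI or SageMaker recipe. The assumptions are: the function name `merge_lora_into_base` is made up for illustration, the base model is loaded in bf16 rather than 4-bit (merging into quantized weights may not work, so a QLoRA adapter is applied to an unquantized copy of the base model here), `trust_remote_code=True` is needed for the Falcon checkpoint, and the machine has enough memory to hold the full model.

```python
def merge_lora_into_base(base_model_id, adapter_dir, output_dir):
    """Merge LoRA adapter weights into a base model and save a plain checkpoint.

    Hypothetical helper, for illustration only. Heavy dependencies are
    imported lazily so the function can be defined without them installed.
    """
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: load the base weights in bf16 (not 4-bit) so the adapter
    # deltas can be added directly to the dense weight matrices.
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,  # assumption: required for this checkpoint
    )

    # Attach the adapter (directory with adapter_config.json + adapter_model.bin).
    model = PeftModel.from_pretrained(base, adapter_dir)

    # Fold the LoRA deltas into the base weights and drop the adapter wrappers.
    merged = model.merge_and_unload()

    # Save a standalone, LoRA-free checkpoint plus the tokenizer.
    merged.save_pretrained(output_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)
    return output_dir
```

With the files from the original post, the call would look something like `merge_lora_into_base("tiiuae/falcon-40b", "model/checkpoint-1000/adapter_model", "merged-falcon-40b")`, where the base model id and the output directory are assumptions. The resulting directory is a regular model checkpoint, so in principle it can be uploaded and served like any other non-adapter model.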
Has anyone tried running TGI on custom fine-tuned models? I have merged the LoRA weights back into the base model and am trying to figure out what the next steps should be.
We're going to do that automatically for you soon: https://github.com/huggingface/text-generation-inference/pull/762
In the meantime: https://github.com/huggingface/text-generation-inference/issues/482#issuecomment-1602174068
Closing this in favor of #482