text-generation-inference Deploy error for Llama-3.2-vision-11B: "Sharded is not supported for AutoModel"

Open xuan1905 opened this issue 5 months ago • 4 comments

System Info

Hi team, when deploying the model on AWS with the huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0 image, I get the error in the title. Could you tell me when TGI will provide a new image? Is there any way I can work around the issue in the meantime?

Information

[X] Docker
[ ] The CLI directly

Tasks

[ ] An officially supported command
[ ] My own modifications

Reproduction

Deploy Llama-3.2-vision-11B with the image huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0 on SageMaker.
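
For readers trying to reproduce this, here is a minimal sketch of the kind of container environment such a deployment typically passes to the TGI image. It is not the reporter's actual configuration: the model ID, the GPU count, and the helper names `build_tgi_env` and `requests_sharding` are assumptions made for illustration. `HF_MODEL_ID` and `SM_NUM_GPUS` are the environment variables commonly used with the Hugging Face TGI containers on SageMaker, and a GPU count above 1 is assumed here to be what triggers a sharded launch.

```python
import json

# Image tag quoted in the report; the registry prefix is not given there, so it is omitted.
IMAGE_TAG = "huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0"


def build_tgi_env(model_id: str, num_gpus: int) -> dict:
    """Build container environment variables (hypothetical helper).

    Values are serialized to strings because container environment
    variables are always strings.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return {
        "HF_MODEL_ID": model_id,
        "SM_NUM_GPUS": json.dumps(num_gpus),
    }


def requests_sharding(env: dict) -> bool:
    """Return True when the config asks for more than one GPU.

    Assumption: more than one GPU means the server is launched sharded.
    """
    return int(env.get("SM_NUM_GPUS", "1")) > 1


# Assumed values, not taken from the report.
env = build_tgi_env("meta-llama/Llama-3.2-11B-Vision-Instruct", num_gpus=4)
print(IMAGE_TAG, env, requests_sharding(env))
```

Checking `requests_sharding(env)` before deploying shows whether a given configuration would ask for a sharded launch, which is the condition the error message refers to.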

Expected behavior

TGI deploys the Llama 3.2 model successfully.

Sep 26 '24 06:09 xuan1905
