Deploy error for Llama-3.2-vision-11B: "Sharded is not supported for AutoModel"

Open xuan1905 opened this issue 5 months ago • 4 comments

System Info

Hi team, when deploying the model on AWS with the huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0 image, I get the error in the title. Could you tell me when TGI will provide a new image? Is there a workaround I can use in the meantime?

Information

  • [X] Docker
  • [ ] The CLI directly

Tasks

  • [ ] An officially supported command
  • [ ] My own modifications

Reproduction

Deploy Llama-3.2-vision-11B with the image huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0 on SageMaker.
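
For anyone trying to reproduce this, here is a minimal sketch of the container environment such a deployment might use. The report does not give the model ID or GPU count, so both values below are assumptions, and `build_tgi_env` is a hypothetical helper rather than part of any SDK. `HF_MODEL_ID` and `SM_NUM_GPUS` are the environment variables the Hugging Face TGI container reads on SageMaker; as far as I know, `SM_NUM_GPUS` sets the shard count, and the error text suggests the server fell back to a generic AutoModel path while more than one shard was requested.

```python
# Image tag taken from the report; everything else here is assumed.
IMAGE_TAG = "2.3.0-tgi2.2.0"
ASSUMED_MODEL_ID = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # assumption
ASSUMED_NUM_GPUS = 4  # assumption: any value > 1 requests sharding


def build_tgi_env(model_id: str, num_gpus: int) -> dict[str, str]:
    """Build the environment variables passed to the TGI container.

    Hypothetical helper for illustration; SageMaker expects every
    environment value to be a string.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return {
        "HF_MODEL_ID": model_id,       # model to load from the Hub
        "SM_NUM_GPUS": str(num_gpus),  # shard count used by the container
    }


env = build_tgi_env(ASSUMED_MODEL_ID, ASSUMED_NUM_GPUS)
print(f"image tag: {IMAGE_TAG}")
for key, value in env.items():
    print(f"{key}={value}")
```

This dictionary would be passed as the `env` argument when creating the model with the SageMaker Python SDK. Gated Llama models typically also need a Hub access token in the environment, which is left out here.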

Expected behavior

TGI deploys the Llama 3.2 model successfully.

xuan1905, Sep 26 '24 06:09