text-generation-inference
Deploy error for Llama-3.2-vision-11B: "Sharded is not supported for AutoModel"
System Info
Hi Team,
When deploying the model on AWS SageMaker with the huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0 image, I got the error above: "Sharded is not supported for AutoModel".
Could you tell me when TGI will provide a new image that supports this model? Is there any way I can work around the issue for the moment?
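A possible interim workaround (untested, and only an assumption on my part): the error suggests TGI is falling back to transformers' AutoModel path, which does not support tensor-parallel sharding, so forcing a single shard may sidestep the sharding error specifically. In a plain Docker deployment that would look roughly like the sketch below (the model id shown is my guess at the Hub repo name); on SageMaker the equivalent would be setting `SM_NUM_GPUS` to `1` in the endpoint environment. Note this may still fail later if the architecture itself is unsupported by the TGI version in the image.

```shell
# Sketch of a single-shard TGI launch to avoid the
# "Sharded is not supported for AutoModel" error.
# --num-shard 1 disables tensor parallelism, so the AutoModel
# fallback is never asked to shard the weights.
docker run --gpus all -p 8080:80 \
  -e HF_TOKEN=<your-token> \
  ghcr.io/huggingface/text-generation-inference:2.2.0 \
  --model-id meta-llama/Llama-3.2-11B-Vision-Instruct \
  --num-shard 1
```

On SageMaker, the same idea would be `"SM_NUM_GPUS": "1"` in the `env` dict passed to the model, at the cost of running the whole model on a single GPU.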
Information
- [X] Docker
- [ ] The CLI directly
Tasks
- [ ] An officially supported command
- [ ] My own modifications
Reproduction
Run the image huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0 on SageMaker.
Expected behavior
TGI deploys the Llama 3.2 Vision model successfully.