[feature-request] Hugging Face multi-model support for GPU
Checklist
- [x] I've prepended issue tag with type of change: [feature]
- [ ] (If applicable) I've documented below the DLC image/dockerfile this relates to
- [ ] (If applicable) I've documented the tests I've run on the DLC image
- [x] I'm using an existing DLC image listed here: https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/deep-learning-containers-images.html
- [ ] I've built my own container based off DLC (and I've attached the code used to build my own image)
Concise Description:
Hi Team,
When I try to deploy a Hugging Face multi-model endpoint on a GPU instance, the call fails with:
Your Ecr Image 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04 does not contain required com.amazonaws.sagemaker.capabilities.multi-models=true
Is there a Hugging Face GPU inference image available that supports multi-model endpoints?
DLC image/dockerfile:
763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04
Is your feature request related to a problem? Please describe.
The following call returns the error above: sm_client.create_model( ModelName=model_name, ExecutionRoleArn=role, Containers=[container] )
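For context, a minimal sketch of the container definition that triggers the capability check; the bucket path and model name below are placeholders, not values from the report:

```python
# Container definition for a SageMaker multi-model endpoint.
# Mode="MultiModel" is what makes SageMaker require the
# com.amazonaws.sagemaker.capabilities.multi-models=true image label.
container = {
    "Image": (
        "763104351884.dkr.ecr.us-east-1.amazonaws.com/"
        "huggingface-pytorch-inference:"
        "1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04"
    ),
    "Mode": "MultiModel",
    # S3 prefix holding the model archives (placeholder path)
    "ModelDataUrl": "s3://my-bucket/models/",
}

# The failing call from the report would then be, e.g.:
# import boto3
# sm_client = boto3.client("sagemaker")
# sm_client.create_model(
#     ModelName="hf-mme-gpu",       # placeholder name
#     ExecutionRoleArn=role,
#     Containers=[container],
# )
```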
Describe the solution you'd like
Being able to successfully create a multi-model endpoint with the Hugging Face GPU inference image.
Describe alternatives you've considered
Additional context
+1 on this feature