[feature-request] Hugging Face multi-model support for GPU
Checklist
- [x] I've prepended issue tag with type of change: [feature]
- [ ] (If applicable) I've documented below the DLC image/dockerfile this relates to
- [ ] (If applicable) I've documented the tests I've run on the DLC image
- [x] I'm using an existing DLC image listed here: https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/deep-learning-containers-images.html
- [ ] I've built my own container based off DLC (and I've attached the code used to build my own image)
Concise Description:
Hi Team,
When I try to deploy a Hugging Face multi-model endpoint on a GPU instance, the call fails with:
Your Ecr Image 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04 does not contain required com.amazonaws.sagemaker.capabilities.multi-models=true
Is there a Hugging Face GPU inference image available that supports multi-model endpoints?
DLC image/dockerfile:
763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04
Is your feature request related to a problem? Please describe.
The following call returns the error above: sm_client.create_model( ModelName=model_name, ExecutionRoleArn=role, Containers=[container] )
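For context, a minimal sketch of the container definition that triggers the capability check; the bucket path and model name below are placeholders, not values from the report:

```python
# Container definition for a SageMaker multi-model endpoint.
# Mode="MultiModel" is what makes SageMaker require the
# com.amazonaws.sagemaker.capabilities.multi-models=true image label.
container = {
    "Image": (
        "763104351884.dkr.ecr.us-east-1.amazonaws.com/"
        "huggingface-pytorch-inference:"
        "1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04"
    ),
    "Mode": "MultiModel",
    # S3 prefix holding the model archives (placeholder path)
    "ModelDataUrl": "s3://my-bucket/models/",
}

# The failing call from the report would then be, e.g.:
# import boto3
# sm_client = boto3.client("sagemaker")
# sm_client.create_model(
#     ModelName="hf-mme-gpu",       # placeholder name
#     ExecutionRoleArn=role,
#     Containers=[container],
# )
```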
Describe the solution you'd like
Being able to successfully create a multi-model endpoint with the Hugging Face GPU inference image.
Describe alternatives you've considered
Additional context
+1 on this feature