
Cannot deploy Huggingface model onto serverless endpoint

Open Peter-Devine opened this issue 3 years ago • 3 comments

**Describe the bug**

When trying to deploy my Huggingface model through:

    predictor = huggingface_model.deploy(
        endpoint_name = endpoint_name,
        serverless_inference_config = {
                "MemorySizeInMB": 1024,
                "MaxConcurrency": 2,
        }
    )

I get the following error:

File "/XXX/lib/python3.9/site-packages/sagemaker/huggingface/model.py", line 271, in deploy
    if not self.image_uri and instance_type.startswith("ml.inf"):
AttributeError: 'NoneType' object has no attribute 'startswith'

I think this is because the Huggingface `deploy` method currently assumes that an instance type is given (it is not prepared for the serverless case). In the serverless case `instance_type` is `None`, but string methods are called on `instance_type` here: https://github.com/aws/sagemaker-python-sdk/blob/f3c2d7ec56fb63878da978c1e58caf3771999218/src/sagemaker/huggingface/model.py#L271

Maybe a simple `not is_serverless and` at the start of this if statement would fix this? Or am I being dense?
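To make the suggestion concrete, here is a minimal sketch of the guard logic; `needs_inferentia_image` is a hypothetical helper for illustration, not part of the SDK:

```python
def needs_inferentia_image(image_uri, instance_type, serverless_inference_config):
    # Hypothetical sketch of the check at model.py line 271: short-circuit
    # before touching instance_type when deploying serverless, so that
    # instance_type=None never reaches .startswith().
    is_serverless = serverless_inference_config is not None
    return (
        not is_serverless
        and not image_uri
        and instance_type.startswith("ml.inf")
    )

# Serverless: instance_type is None, but no AttributeError is raised.
print(needs_inferentia_image(None, None, {"MemorySizeInMB": 1024, "MaxConcurrency": 2}))  # False
# Instance-backed Inferentia deployment still triggers the image lookup path.
print(needs_inferentia_image(None, "ml.inf1.xlarge", None))  # True
```

Because `and` short-circuits, the serverless case returns `False` before any string method is called on `instance_type`.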

Thanks!

Peter-Devine avatar Mar 18 '22 23:03 Peter-Devine

Hi,

You need to provide an `instance_type`; its default value is `None`, which is why you are getting the error `AttributeError: 'NoneType' object has no attribute 'startswith'`. See the deploy method signature in the doc here.

deploy(initial_instance_count=None, instance_type=None, serializer=None, deserializer=None, accelerator_type=None, endpoint_name=None, tags=None, kms_key=None, wait=True, data_capture_config=None, async_inference_config=None, serverless_inference_config=None, **kwargs)

You can find the list of available instance types here: https://aws.amazon.com/sagemaker/pricing/

mohamed-ali avatar Mar 19 '22 09:03 mohamed-ali

But if I want to make a serverless endpoint (as described here - https://aws.amazon.com/about-aws/whats-new/2021/12/amazon-sagemaker-serverless-inference/), then I cannot supply an instance type, as this option explicitly has no defined instance.

In the AWS tutorial provided for making a serverless endpoint (https://aws.amazon.com/blogs/machine-learning/deploying-ml-models-using-sagemaker-serverless-inference-preview/), under the heading "Endpoint configuration creation", there is no instance_type required:

endpoint_config_response = client.create_endpoint_config(
    EndpointConfigName=xgboost_epc_name,
    ProductionVariants=[
        {
        "VariantName": "byoVariant",
        "ModelName": model_name,
        "ServerlessConfig": {
        "MemorySizeInMB": 4096,
        "MaxConcurrency": 1,
        },
        },
    ],
)

I should be able to do this through `HuggingFaceModel.deploy()` too, but it seems that the API hasn't been updated to support this (relatively new) feature yet.
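The mapping from the boto3 call above can be sketched with a small hypothetical helper (not part of any SDK) that makes the key point explicit: a serverless ProductionVariant carries a `ServerlessConfig` block and no `InstanceType` key at all:

```python
def serverless_variant(variant_name, model_name, memory_mb=4096, max_concurrency=1):
    # Hypothetical helper: builds the ProductionVariant dict shape used by
    # create_endpoint_config for a serverless endpoint. Note there is no
    # InstanceType key -- the instance is managed by the service.
    return {
        "VariantName": variant_name,
        "ModelName": model_name,
        "ServerlessConfig": {
            "MemorySizeInMB": memory_mb,
            "MaxConcurrency": max_concurrency,
        },
    }

variant = serverless_variant("byoVariant", "my-model")
print("InstanceType" in variant)  # False: serverless variants carry no instance type
```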

Peter-Devine avatar Mar 19 '22 21:03 Peter-Devine

Thanks for clarifying that you want to deploy in serverless mode.

In your case, you need to provide an image_uri. See how the image_uri is retrieved in section "Setup and training" and used in section "Model creation" of this tutorial: https://aws.amazon.com/blogs/machine-learning/deploying-ml-models-using-sagemaker-serverless-inference-preview/.

Here's an example of retrieving the image URI for Hugging Face with TensorFlow as the base framework.

import sagemaker

sagemaker.image_uris.retrieve(
    framework="huggingface",
    region="eu-west-1",
    version="4.6.1",
    py_version="py37",
    image_scope='inference',
    instance_type="ml.m5.2xlarge",
    base_framework_version='tensorflow2.4.1'
)
# gives a uri such as: '763104351884.dkr.ecr.eu-west-1.amazonaws.com/huggingface-tensorflow-inference:2.4.1-transformers4.6.1-cpu-py37-ubuntu18.04' 
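For reference, the returned URI encodes the ECR registry host, the repository name, and a tag combining the framework and transformers versions; a quick framework-agnostic way to pull those pieces apart (plain string handling, no SageMaker calls):

```python
uri = "763104351884.dkr.ecr.eu-west-1.amazonaws.com/huggingface-tensorflow-inference:2.4.1-transformers4.6.1-cpu-py37-ubuntu18.04"

registry, _, remainder = uri.partition("/")    # ECR registry host (account + region)
repository, _, tag = remainder.partition(":")  # repository name and image tag

print(registry)    # 763104351884.dkr.ecr.eu-west-1.amazonaws.com
print(repository)  # huggingface-tensorflow-inference
print(tag)         # 2.4.1-transformers4.6.1-cpu-py37-ubuntu18.04
```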

mohamed-ali avatar Mar 21 '22 15:03 mohamed-ali

@Peter-Devine does this solve your issue?

davidbrochart avatar Sep 21 '23 15:09 davidbrochart

@Peter-Devine We are closing this issue due to inactivity. Please feel free to reopen the issue if the suggested solution doesn't solve it for you. Thanks!

knikure avatar Dec 14 '23 14:12 knikure