Cannot deploy Huggingface model onto serverless endpoint
Describe the bug

When trying to deploy my Huggingface model through:

```python
predictor = huggingface_model.deploy(
    endpoint_name=endpoint_name,
    serverless_inference_config={
        "MemorySizeInMB": 1024,
        "MaxConcurrency": 2,
    },
)
```
I get the following error:

```
File "/XXX/lib/python3.9/site-packages/sagemaker/huggingface/model.py", line 271, in deploy
    if not self.image_uri and instance_type.startswith("ml.inf"):
AttributeError: 'NoneType' object has no attribute 'startswith'
```
I think this is because the Huggingface deploy method currently assumes that an instance type is given (it isn't ready for serverless deployment yet). In the serverless case instance_type is None, but deploy calls string methods on instance_type here:
https://github.com/aws/sagemaker-python-sdk/blob/f3c2d7ec56fb63878da978c1e58caf3771999218/src/sagemaker/huggingface/model.py#L271
Maybe a simple `not is_serverless and` at the start of this if statement would fix this, along the lines of the sketch below? Or am I being dense?
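To make the suggestion concrete, here is a minimal sketch of the guard I have in mind (`is_serverless` is a hypothetical local flag derived from serverless_inference_config, not an existing variable in model.py):

```python
# Hypothetical sketch of a fix inside HuggingFaceModel.deploy();
# is_serverless is an assumed flag, not an existing variable in model.py.
is_serverless = serverless_inference_config is not None
if not is_serverless and not self.image_uri and instance_type.startswith("ml.inf"):
    ...  # existing logic that selects the Inferentia inference image
```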
Thanks!
Hi,
You need to provide an instance_type; its default value is None, which is why you are getting the error `AttributeError: 'NoneType' object has no attribute 'startswith'`.
See the deploy method signature in the doc here.
```python
deploy(initial_instance_count=None, instance_type=None, serializer=None, deserializer=None, accelerator_type=None, endpoint_name=None, tags=None, kms_key=None, wait=True, data_capture_config=None, async_inference_config=None, serverless_inference_config=None, **kwargs)
```
You can find the list of available instance types here: https://aws.amazon.com/sagemaker/pricing/
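For example, a non-serverless deployment with an explicit instance type might look like this (the instance type below is just an illustration, any supported SageMaker instance type works):

```python
# Sketch of a provisioned (non-serverless) deployment; ml.m5.xlarge is
# an arbitrary example instance type.
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name=endpoint_name,
)
```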
But if I want to make a serverless endpoint (as described here - https://aws.amazon.com/about-aws/whats-new/2021/12/amazon-sagemaker-serverless-inference/), then I cannot supply an instance type, as this option explicitly has no defined instance.
In the AWS tutorial for creating a serverless endpoint (https://aws.amazon.com/blogs/machine-learning/deploying-ml-models-using-sagemaker-serverless-inference-preview/), under the heading "Endpoint configuration creation", no instance_type is required:
```python
endpoint_config_response = client.create_endpoint_config(
    EndpointConfigName=xgboost_epc_name,
    ProductionVariants=[
        {
            "VariantName": "byoVariant",
            "ModelName": model_name,
            "ServerlessConfig": {
                "MemorySizeInMB": 4096,
                "MaxConcurrency": 1,
            },
        },
    ],
)
```
I should be able to do this through HuggingFaceModel.deploy() too (roughly as in the sketch below), but it seems the API hasn't been updated to support this (relatively new) feature yet.
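For reference, this is roughly the call I would expect to work, using the ServerlessInferenceConfig class from sagemaker.serverless (assuming an SDK version in which HuggingFaceModel.deploy() supports it):

```python
from sagemaker.serverless import ServerlessInferenceConfig

# Sketch of the expected SDK-level serverless deployment; assumes an SDK
# version where deploy() accepts a serverless config instead of an
# instance type.
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,
    max_concurrency=1,
)
predictor = huggingface_model.deploy(
    serverless_inference_config=serverless_config,
    endpoint_name=endpoint_name,
)
```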
Thanks for clarifying that you want to deploy in serverless mode.
In your case, you need to provide an image_uri. See how the image_uri is retrieved in the "Setup and training" section and used in the "Model creation" section of this tutorial: https://aws.amazon.com/blogs/machine-learning/deploying-ml-models-using-sagemaker-serverless-inference-preview/.
Here's an example of retrieving the URI for Hugging Face with TensorFlow as the base framework:

```python
import sagemaker

# Retrieve the Hugging Face TensorFlow inference image URI for a given
# region, framework version, Python version, and instance type.
image_uri = sagemaker.image_uris.retrieve(
    framework="huggingface",
    region="eu-west-1",
    version="4.6.1",
    py_version="py37",
    image_scope="inference",
    instance_type="ml.m5.2xlarge",
    base_framework_version="tensorflow2.4.1",
)
# gives a URI such as:
# '763104351884.dkr.ecr.eu-west-1.amazonaws.com/huggingface-tensorflow-inference:2.4.1-transformers4.6.1-cpu-py37-ubuntu18.04'
```
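The retrieved URI can then be passed to the model explicitly, so deploy() does not need to infer an image from an instance type. A minimal sketch (the model_data S3 path and role are placeholders you would replace with your own):

```python
from sagemaker.huggingface import HuggingFaceModel

# Sketch: pass the retrieved image URI explicitly; model_data and role
# below are hypothetical placeholders.
huggingface_model = HuggingFaceModel(
    image_uri=image_uri,
    model_data="s3://my-bucket/model.tar.gz",
    role=role,
)
```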
@Peter-Devine does this solve your issue?
@Peter-Devine We are closing this issue due to inactivity. Please feel free to reopen it if the suggested solution doesn't solve the issue for you. Thanks!