sagemaker-inference-toolkit
End-to-End example for *NON* multi model deployment
What did you find confusing? Please describe.
Based on the documentation and the complete example multi_model_bring_your_own, it seems that sagemaker-inference-toolkit
is only for multi-model requirements. But I have also seen links to sagemaker_pytorch_serving_container, which suggests that is not the case.
There is no clear instruction in the documentation, or an end-to-end example, indicating that it can also be used for single-model hosting scenarios.
Describe how documentation can be improved
You can provide one more end-to-end example for single-model hosting, along with some points in favor of using this Python package instead of designing our own Docker containers from scratch.
Additional context
Hi @nvs-abhilash, thanks for bringing this to our attention. We are planning to introduce an example of sagemaker-inference-toolkit + multi-model-server.
Meanwhile, you can refer to these instructions to implement your own handler service:
custom handler service: https://github.com/awslabs/multi-model-server/blob/master/docs/custom_service.md#example-custom-service-file
custom inference handler: https://github.com/aws/sagemaker-inference-toolkit/blob/master/README.md#implementation-steps
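For anyone landing here, a minimal single-model setup following those implementation steps might look roughly like the sketch below. This is only an illustration, not an official example: the module name model_handler.py and the joblib-based model loading are assumptions, and you would replace them with whatever fits your framework.

```python
# model_handler.py -- illustrative sketch of a custom inference handler and
# handler service built on sagemaker-inference-toolkit (not an official example).
import os

from sagemaker_inference import decoder, encoder
from sagemaker_inference.default_handler_service import DefaultHandlerService
from sagemaker_inference.default_inference_handler import DefaultInferenceHandler
from sagemaker_inference.transformer import Transformer


class SingleModelInferenceHandler(DefaultInferenceHandler):
    def default_model_fn(self, model_dir):
        # Assumption: the model artifact was saved with joblib as model.joblib.
        import joblib
        return joblib.load(os.path.join(model_dir, "model.joblib"))

    def default_input_fn(self, input_data, content_type):
        # Deserialize the request payload (e.g. CSV or NPY) into an array.
        return decoder.decode(input_data, content_type)

    def default_predict_fn(self, data, model):
        # Run inference with the loaded model.
        return model.predict(data)

    def default_output_fn(self, prediction, accept):
        # Serialize the prediction into the requested content type.
        return encoder.encode(prediction, accept)


class HandlerService(DefaultHandlerService):
    def __init__(self):
        transformer = Transformer(default_inference_handler=SingleModelInferenceHandler())
        super(HandlerService, self).__init__(transformer=transformer)
```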
Hi, my confusion is not about how to use it with Multi Model Server; I think that is very well documented in the MMS SageMaker example.
What we don't have is an example for "Bring Your Own Model" using sagemaker-inference-toolkit.
Without that, it looks as if sagemaker-inference-toolkit can only work with multi-model-server.
Maybe a sister example to scikit_bring_your_own, but using sagemaker-inference-toolkit.
Let me know if that makes sense.
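To make that concrete, a "bring your own container" serving entrypoint for a single model could be as small as the sketch below. Again this is a hedged sketch, not a confirmed example: the file name serve.py is an assumption, and the handler-service string would point at whichever module defines your HandlerService (model_handler in the earlier sketch).

```python
# serve.py -- illustrative serving entrypoint for a single-model container.
# It starts multi-model-server through sagemaker-inference-toolkit with one
# handler service, which works for single-model endpoints as well.
from sagemaker_inference import model_server

# Assumption: model_handler.py (sketched above) is on the PYTHONPATH inside
# the container, so MMS can import it as the handler service.
HANDLER_SERVICE = "model_handler"

if __name__ == "__main__":
    model_server.start_model_server(handler_service=HANDLER_SERVICE)
```

The container's Dockerfile would then roughly pip install multi-model-server and sagemaker-inference, copy these two files in, and set serve.py as the ENTRYPOINT; the endpoint itself is created the same way as for any other bring-your-own container.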
I am looking for the same information and agree that such an example would be useful.
I've been struggling with the same issue. Any update on this?
Hi,
I've also been looking for examples of applying sagemaker-inference-toolkit to a single-model deployment that does not use PyTorch. The best example, which might also benefit others, is the following repo from an AWS workshop: https://github.com/awslabs/amazon-sagemaker-mlops-workshop.
This is a particular issue when trying to host a model on the new serverless inference variant, which does not support multi-model hosting (see the excluded features). I am trying to host the model with my own Flask API for now.
Hi, I have the same issue now. It has been a few years. Any updates?