
Deploying multiple model artifacts, each having their own inference handler

ghost opened this issue 4 years ago • 5 comments

What did you find confusing? Please describe. I am trying to deploy multiple tarball model artifacts to a SageMaker multi-model endpoint, but would like to use a different inference handler for each model, since each model needs different pre-processing and post-processing.

Describe how documentation can be improved The documentation is fairly clear on how to specify a custom inference handler, but not on whether a different custom handler can be specified for each model.

Additional context I discovered that a custom handler can be provided to the MMS model archiver here, but it's not clear if this allows different handlers for each model.
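For context, here is roughly what a single custom handler module (an inference.py) looks like with the toolkit today, following the documented model_fn/input_fn/predict_fn/output_fn contract; what I'd like is to ship one of these per model tarball. The echo "model" below is just a stand-in for real framework loading and prediction:

```python
import json

# Sketch of one model's custom handler (inference.py) as supported today.
# The trivial echo model stands in for real framework loading/inference.

def model_fn(model_dir):
    """Load the model from model_dir; here a trivial stand-in object."""
    return {"model_dir": model_dir}

def input_fn(input_data, content_type):
    """Deserialize the request; this model expects JSON."""
    if content_type == "application/json":
        return json.loads(input_data)
    raise ValueError(f"Unsupported content type: {content_type}")

def predict_fn(data, model):
    """Model-specific inference; echoes the parsed payload here."""
    return {"input": data, "model_dir": model["model_dir"]}

def output_fn(prediction, accept):
    """Serialize the response; this model returns JSON."""
    if accept == "application/json":
        return json.dumps(prediction)
    raise ValueError(f"Unsupported accept type: {accept}")
```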

I love the inference toolkit, and would sincerely appreciate a response on whether it is possible to define a different inference handler per model, and if so, how.

ghost avatar Jun 22 '20 11:06 ghost

thanks for the kind words! Unfortunately, this isn't currently supported, but I'll leave this issue open as a feature request.

laurenyu avatar Jun 23 '20 17:06 laurenyu

It took me a long time of reading the code to figure out that this isn't supported. I was experimenting with single models first and then wanted to move to multi-model, so it's unfortunate that it isn't.

manojlds avatar Jun 25 '20 19:06 manojlds

+1 for this

alext234 avatar Jul 17 '20 03:07 alext234

+1, it's super confusing how this is all supposed to fit together. One would assume that the SageMaker Inference Toolkit would support the same functionality as the Multi Model Server. It seems like the only option is to use separate endpoints after all.
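The closest thing to a workaround I can think of is a single shared handler whose hooks dispatch per model. A minimal sketch, assuming each model tarball ships a small config.json naming its preprocessing; all names here are hypothetical and not part of the toolkit's API:

```python
import json
import os

# Hypothetical workaround sketch: one shared inference.py whose predict_fn
# dispatches per model, keyed on a "preprocessor" field read from a
# config.json packed into each model tarball.

def preprocess_text(data):
    # per-model preprocessing example: lowercase text inputs
    return [s.lower() for s in data]

def preprocess_image(data):
    # per-model preprocessing example: scale 0-255 pixel values to [0, 1]
    return [x / 255.0 for x in data]

PREPROCESSORS = {"text": preprocess_text, "image": preprocess_image}

def model_fn(model_dir):
    """Load the per-model config alongside the model so predict_fn can dispatch."""
    with open(os.path.join(model_dir, "config.json")) as f:
        config = json.load(f)
    # a real handler would also load framework weights from model_dir here
    return {"config": config}

def predict_fn(data, model):
    """Route to the preprocessing registered for this model, then predict."""
    preprocess = PREPROCESSORS[model["config"]["preprocessor"]]
    processed = preprocess(data)
    # stand-in for actual model inference on the processed input
    return processed
```

This keeps one handler image while still varying pre/post-processing per model, at the cost of bundling all variants into the shared module.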

ckang244 avatar Aug 12 '20 18:08 ckang244

any update on this? I'm trying to achieve a similar thing. It would be great if this toolkit supported multiple models, with each model having its own inference code.

n0thing233 avatar Aug 16 '21 16:08 n0thing233