
[Q] GPU support

[Open] oonisim opened this issue 4 years ago • 3 comments

The AWS documentation (https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html) states: "Multi-model endpoints are not supported on GPU instance types."

Kindly explain whether this is technically impossible or simply not yet implemented.

oonisim · Aug 17 '20, 09:08

Hi @oonisim

Do you know how we can run inference on multi-model endpoints for models that require GPU memory?

Thanks

vinayak-shanawad · Jan 23 '22, 15:01

Hi @Vinayaks117, as per the AWS documentation (https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html), "Multi-model endpoints are not supported on GPU instance types", so I am not sure whether you can run Multi Model Server on GPU instances (see the AWS GitHub repository for the Multi Model Server implementation; I believe it is framework dependent, e.g. PyTorch or TensorFlow). Please open a case with AWS Support for an authoritative answer; I am afraid that is the only way.
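
For reference, here is a minimal sketch of what the documented (CPU-only) setup looks like with boto3: a multi-model endpoint backed by a CPU instance type, invoked with `TargetModel`. The bucket, role ARN, image URI, and names below are placeholders, not taken from this thread.

```python
import boto3

sm = boto3.client("sagemaker")
runtime = boto3.client("sagemaker-runtime")

# Placeholder values -- replace with your own account/region specifics.
MODEL_DATA_PREFIX = "s3://my-bucket/mme-models/"             # hypothetical S3 prefix holding model .tar.gz archives
ROLE_ARN = "arn:aws:iam::123456789012:role/MySageMakerRole"  # hypothetical execution role
IMAGE_URI = "<framework-inference-image-uri>"                # a framework serving image that supports multi-model mode

# "Mode": "MultiModel" is what makes this a multi-model endpoint;
# ModelDataUrl points at an S3 prefix rather than a single model archive.
sm.create_model(
    ModelName="mme-demo",
    ExecutionRoleArn=ROLE_ARN,
    PrimaryContainer={
        "Image": IMAGE_URI,
        "Mode": "MultiModel",
        "ModelDataUrl": MODEL_DATA_PREFIX,
    },
)

# Per the documentation quoted above, use a CPU instance type here,
# since GPU instance types are stated as unsupported for multi-model endpoints.
sm.create_endpoint_config(
    EndpointConfigName="mme-demo-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "mme-demo",
        "InstanceType": "ml.m5.xlarge",
        "InitialInstanceCount": 1,
    }],
)
sm.create_endpoint(EndpointName="mme-demo", EndpointConfigName="mme-demo-config")

# Invocation selects which model archive to load via TargetModel,
# given as a key relative to MODEL_DATA_PREFIX.
response = runtime.invoke_endpoint(
    EndpointName="mme-demo",
    TargetModel="model-a.tar.gz",
    ContentType="application/json",
    Body=b'{"inputs": [1, 2, 3]}',
)
print(response["Body"].read())
```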

oonisim · Jan 23 '22, 19:01

Sure, thanks @oonisim

vinayak-shanawad · Jan 24 '22, 09:01