
Inference of a PyTorch model mapped to an Elastic Inference Accelerator is 15 times slower than CPU inference

Open Bilal-Yousaf opened this issue 1 year ago • 2 comments

Describe the bug Inference of a PyTorch model mapped to an Elastic Inference Accelerator is 15 times slower than CPU inference.

To reproduce I am loading the CLIP model on an Elastic Inference Accelerator. I had to make some changes to the CLIP code, and I get the exact same output from the EIA call as from the CPU call, but EIA inference is 15 times slower.
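One thing worth ruling out when comparing CPU and EIA latency is cold-start cost: the first call through an Elastic Inference Accelerator loads and compiles the model graph, so including it in the measurement can inflate the average dramatically. Below is a minimal, stdlib-only benchmarking sketch that discards warm-up calls and reports median latency; `dummy_predict` is a hypothetical stand-in for the real traced CLIP forward pass, which is not shown in the issue.

```python
import time
import statistics

def benchmark(predict, batch, warmup=5, iters=50):
    """Time repeated calls to `predict(batch)` and return median latency in ms."""
    for _ in range(warmup):
        # Warm-up runs are excluded: on EIA the first inference
        # triggers model loading/compilation on the accelerator.
        predict(batch)
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        predict(batch)
        samples.append((time.perf_counter() - t0) * 1e3)
    return statistics.median(samples)

# Hypothetical stand-in for a model call (replace with the real forward pass).
dummy_predict = lambda x: sum(v * v for v in x)
batch = list(range(1000))
print(f"median latency: {benchmark(dummy_predict, batch):.3f} ms")
```

Running the same harness against both the CPU path and the EIA path (same inputs, same batch size) gives a like-for-like comparison; if the 15x gap persists after warm-up, the slowdown is genuine rather than a cold-start artifact.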

Expected behavior The EIA call should be faster than the CPU call.

System information

  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
  • Framework version: 1.5.1
  • Python version: 3.6
  • CPU or GPU: eia2.xlarge
  • Custom Docker image (Y/N): N using 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference-eia

Additional context I am running the model using Docker Container

Bilal-Yousaf avatar Jul 15 '22 22:07 Bilal-Yousaf

@qidewenwhen Could you please advise whether there is any step we need to perform in addition to what the documentation says? This is urgent; we have a client deployment in 2 days.

Bilal-Yousaf avatar Jul 20 '22 17:07 Bilal-Yousaf

Hi @Bilal-Yousaf, sorry, this issue is beyond my knowledge, as it is out of the scope of my team (SageMaker Pipelines). I have added the label "component: hosting" to this issue.

Since the issue seems urgent, @navinsoni or @BasilBeirouti, could you please tag the POC of the hosting team or redirect this issue to them? I'm not sure who the POC is.

qidewenwhen avatar Jul 20 '22 19:07 qidewenwhen

@Bilal-Yousaf Thanks for reporting this issue. Please refer to the Elastic Inference documentation for the current recommendation: https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html

Q: We currently use Amazon Elastic Inference (EI) accelerators. Will we be able to continue using them after April 15, 2023?

Yes, you will be able to use Amazon EI accelerators. We recommend that you migrate your current ML inference workloads running on Amazon EI to other hardware accelerator options at your earliest convenience.

We recommend going over the migration steps: https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html#ei-migration

Please reopen this issue if you run into any problems with the migration.

mohanasudhan avatar Dec 13 '23 19:12 mohanasudhan