(sagemaker-model-deployment): Support AWS Inferentia instance types
### Describe the feature

I'd like to be able to use AWS Inferentia instance types for my endpoint inference containers.

### Use Case

### Proposed Solution

No response

### Other Information

No response

### Acknowledgements

- [X] I may be able to implement this feature request
- [ ] This feature might incur a breaking change
Hi @kukushking, we have two samples demonstrating how to deploy models on Inferentia:
- https://github.com/aws-samples/generative-ai-cdk-constructs-samples/tree/main/samples/sagemaker_huggingface_inferentia
- https://github.com/aws-samples/generative-ai-cdk-constructs-samples/tree/main/samples/sagemaker_custom_endpoint
Is there a particular model you are looking to deploy?
Thanks @krokoko. Is this supported for JumpStart Foundation Models? I don't see Inf instance types here.

Ah, sorry, just found it. Never mind, closing the issue.
```
Error: The instance type ml.inf1.2xlarge is not supported. Default instance type: ml.g5.2xlarge. Supported instance types: ml.g5.2xlarge, ml.g5.4xlarge, ml.g5.8xlarge, ml.g5.16xlarge.
```
I get this error when deploying Mistral 7B with an Inf instance type. Am I doing something wrong?
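For context, the check that produces this error can be sketched as below. This is a minimal illustration, not the construct's actual code: the function name and structure are made up here, and the supported-instance list is copied verbatim from the error message above.

```python
# Hypothetical sketch of the instance-type validation behind the error above.
# The supported list is taken from the error message; the function name and
# structure are illustrative, not the construct's real implementation.

DEFAULT_INSTANCE_TYPE = "ml.g5.2xlarge"
SUPPORTED_INSTANCE_TYPES = [
    "ml.g5.2xlarge",
    "ml.g5.4xlarge",
    "ml.g5.8xlarge",
    "ml.g5.16xlarge",
]

def validate_instance_type(instance_type: str) -> str:
    """Return the instance type if supported; otherwise raise the error seen above."""
    if instance_type not in SUPPORTED_INSTANCE_TYPES:
        raise ValueError(
            f"The instance type {instance_type} is not supported. "
            f"Default instance type: {DEFAULT_INSTANCE_TYPE}. "
            f"Supported instance types: {', '.join(SUPPORTED_INSTANCE_TYPES)}."
        )
    return instance_type
```

Since `ml.inf1.2xlarge` is absent from the allow-list for this model, the deployment fails before any endpoint is created; supporting Inferentia would mean extending that list for models that publish Neuron-compatible containers.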
Could you please share the code snippet you are using?
Sure, here is the link.
Thanks @kukushking, will add it to the backlog.
This issue is now marked as stale because it hasn't seen activity for a while. Add a comment or it will be closed soon. If you wish to exclude this issue from being marked as stale, add the "backlog" label.
Closing this issue as it hasn't seen activity for a while. Please add a comment @mentioning a maintainer to reopen. If you wish to exclude this issue from being marked as stale, add the "backlog" label.