(sagemaker-model-deployment): Support AWS Inferentia instance types

Open kukushking opened this issue 1 year ago • 10 comments

Describe the feature

I'd like to be able to use AWS Inferentia for my endpoint inference containers

Use Case

Proposed Solution

No response

Other Information

No response

Acknowledgements

  • [X] I may be able to implement this feature request
  • [ ] This feature might incur a breaking change

kukushking avatar Feb 28 '24 13:02 kukushking

Hi @kukushking, we do have 2 samples demonstrating how to deploy models on Inferentia:

  • https://github.com/aws-samples/generative-ai-cdk-constructs-samples/tree/main/samples/sagemaker_huggingface_inferentia
  • https://github.com/aws-samples/generative-ai-cdk-constructs-samples/tree/main/samples/sagemaker_custom_endpoint

Is there a particular model you are looking to deploy?

krokoko avatar Feb 28 '24 15:02 krokoko

Thanks @krokoko. Is this supported for JumpStart Foundation Models? I don't see Inf instance types here

kukushking avatar Feb 28 '24 17:02 kukushking

Ah sorry just found it. Nvm, closing the issue.

kukushking avatar Feb 28 '24 17:02 kukushking

Error: The instance type ml.inf1.2xlarge is not supported. Default instance type: ml.g5.2xlarge. Supported instance types: ml.g5.2xlarge, ml.g5.4xlarge, ml.g5.8xlarge, ml.g5.16xlarge.

I get this error when deploying Mistral 7B with Inf. Am I doing something wrong?

kukushking avatar Feb 28 '24 17:02 kukushking
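
The error above reads like the result of an allow-list check: the requested instance type is compared against the types registered for the chosen model, and anything outside that list is rejected. The construct's actual validation code is not shown in this thread, so the following is only a minimal sketch of that kind of check. The function name `check_instance_type` and its parameters are hypothetical; the instance type values are copied from the error message.

```python
from typing import Sequence


def check_instance_type(requested: str, default: str, supported: Sequence[str]) -> str:
    """Return `requested` if it is in `supported`, otherwise raise ValueError.

    Hypothetical helper for illustration; not the library's real API.
    """
    if requested not in supported:
        raise ValueError(
            f"The instance type {requested} is not supported. "
            f"Default instance type: {default}. "
            f"Supported instance types: {', '.join(supported)}."
        )
    return requested


# Values copied from the error message quoted in this thread.
SUPPORTED = ["ml.g5.2xlarge", "ml.g5.4xlarge", "ml.g5.8xlarge", "ml.g5.16xlarge"]
DEFAULT = "ml.g5.2xlarge"

try:
    # An Inferentia type is rejected because it is not in the model's list.
    check_instance_type("ml.inf1.2xlarge", DEFAULT, SUPPORTED)
except ValueError as err:
    print(f"Error: {err}")
```

If the check works this way, the requested type fails simply because no Inferentia type is in the list registered for that model, which would fit the maintainer's later decision to add the request to the backlog.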

Could you please share the code snippet you are using?

krokoko avatar Feb 28 '24 17:02 krokoko

Sure, here is the link.

kukushking avatar Mar 07 '24 16:03 kukushking

Thanks @kukushking, will add it to the backlog.

krokoko avatar Mar 11 '24 18:03 krokoko

This issue is now marked as stale because it hasn't seen activity for a while. Add a comment or it will be closed soon. If you wish to exclude this issue from being marked as stale, add the "backlog" label.

github-actions[bot] avatar May 11 '24 01:05 github-actions[bot]

Closing this issue as it hasn't seen activity for a while. Please add a comment @mentioning a maintainer to reopen. If you wish to exclude this issue from being marked as stale, add the "backlog" label.

github-actions[bot] avatar May 18 '24 01:05 github-actions[bot]

Closing this issue as it hasn't seen activity for a while. Please add a comment @mentioning a maintainer to reopen. If you wish to exclude this issue from being marked as stale, add the "backlog" label.

github-actions[bot] avatar May 28 '24 01:05 github-actions[bot]