[Feature Request] Allow users to connect pr_agent to an existing SageMaker inference endpoint

Open mattiaciollaro opened this issue 1 year ago • 4 comments

Context: suppose a user already has an LLM inference endpoint configured in SageMaker (for example, a self-hosted Llama model as described here). It would be nice to let the user configure pr-agent to use that endpoint, e.g. by means of a dedicated AI handler.

Discord chat: https://discord.com/channels/1057273017547378788/1057273018084237344/1197261978591309884

cc: @krrishdholakia

mattiaciollaro avatar Jan 18 '24 18:01 mattiaciollaro

@krrishdholakia do you think this request is feasible?

mrT23 avatar Jan 23 '24 19:01 mrT23

Hey @mattiaciollaro @mrT23 we already support sagemaker - https://docs.litellm.ai/docs/providers/aws_sagemaker

What am I missing?

krrishdholakia avatar Jan 23 '24 21:01 krrishdholakia

@mattiaciollaro is this PR still relevant?

mrT23 avatar Jan 28 '24 07:01 mrT23

Sorry for the delay, guys.

> Hey @mattiaciollaro @mrT23 we already support sagemaker - https://docs.litellm.ai/docs/providers/aws_sagemaker

https://docs.litellm.ai/docs/providers/aws_sagemaker seems to support SageMaker JumpStart models specifically.

I am thinking of a different situation where a model is already deployed via SageMaker and a reference to the inference endpoint name is available (as in here). In that case, how can we instruct pr-agent to leverage the LLM behind that pre-existing endpoint? I am not sure I see a way of doing this via https://docs.litellm.ai/docs/providers/aws_sagemaker

In the context of a POC with my team, we accomplished this by hacking pr-agent's default AI handler (the LiteLLM AI handler) to use the SageMaker SDK (specifically, the HF predictor) to make requests to the pre-existing SageMaker endpoint.

I imagine a cleaner solution would be to implement a dedicated AI handler for this use case?
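
To make the idea concrete, here is a rough sketch of what such a handler might look like. It is not taken from pr-agent or from our POC. The class name `SageMakerEndpointHandler`, the `max_new_tokens` parameter, and the `(text, finish_reason)` return shape are illustrative assumptions, and the request/response schema (`inputs`/`parameters` in, a list with `generated_text` out) assumes a Hugging Face text-generation container behind the endpoint. The client is injected, so anything exposing a boto3-style `invoke_endpoint` method works.

```python
import asyncio
import json


class SageMakerEndpointHandler:
    """Hypothetical AI handler for an already-deployed SageMaker endpoint."""

    def __init__(self, runtime_client, endpoint_name, max_new_tokens=512):
        # runtime_client is assumed to behave like a boto3 "sagemaker-runtime"
        # client, i.e. to expose invoke_endpoint(EndpointName=..., Body=..., ...).
        self.client = runtime_client
        self.endpoint_name = endpoint_name
        self.max_new_tokens = max_new_tokens

    @property
    def deployment_id(self):
        # The endpoint name is the only identifier needed to reach the model.
        return self.endpoint_name

    def build_payload(self, system, user, temperature):
        # Assumed schema: Hugging Face text-generation containers accept a
        # single prompt string plus generation parameters.
        return {
            "inputs": f"{system}\n\n{user}",
            "parameters": {
                "temperature": temperature,
                "max_new_tokens": self.max_new_tokens,
            },
        }

    async def chat_completion(self, model, system, user, temperature=0.2):
        # `model` is accepted for interface compatibility but unused here:
        # the endpoint already determines which model answers.
        payload = self.build_payload(system, user, temperature)
        # The runtime call is blocking, so run it in a worker thread.
        response = await asyncio.to_thread(
            self.client.invoke_endpoint,
            EndpointName=self.endpoint_name,
            ContentType="application/json",
            Body=json.dumps(payload),
        )
        result = json.loads(response["Body"].read())
        # Assumed response shape: [{"generated_text": "..."}]
        return result[0]["generated_text"], "stop"
```

In a real setup the client would presumably be `boto3.client("sagemaker-runtime")`, and the class would need to be adapted to whatever base interface pr-agent expects from its AI handlers. Endpoints serving a different container would need a different payload and response parsing.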

> @mattiaciollaro is this PR still relevant?

I don't have a PR out for this, but yes: I think the feature request is still relevant :) My apologies again for the delay!

mattiaciollaro avatar Jan 28 '24 18:01 mattiaciollaro