Please help. How can I integrate MCP with a model deployed on AWS SageMaker? I'd much appreciate your help.
I have a model deployed on AWS SageMaker and would like to use MCP with that model. Is this possible?
Hi @aatish-shinde, thank you for your question! This is related to #40, which @MattMorgis is working on.
If the API is OpenAI-compatible, then you can also set the base_url to point at the instance and use that (though I don't think SageMaker in particular exposes that). Please see this example of use with Ollama models: https://github.com/lastmile-ai/mcp-agent/blob/main/examples/mcp_basic_ollama_agent/mcp_agent.config.yaml#L24.
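For illustration, here's roughly what the base_url pattern looks like with the plain `openai` Python client; this is a minimal sketch, not mcp-agent's config mechanism, and the URL, API key, and model name are placeholders (a local Ollama server in this case):

```python
# Minimal sketch: pointing the openai client at any OpenAI-compatible server.
# The base_url, api_key, and model below are placeholder values for a local
# Ollama instance; substitute whatever endpoint actually speaks the OpenAI API.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",  # required by the client, ignored by Ollama
)

response = client.chat.completions.create(
    model="llama3.2",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```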
@saqadri Thanks for the reply. I already have a SageMaker client class that uses the boto3 library, and I would like to use this class to get responses from the LLM. Any idea if that's possible? It looks like this:
```python
import json
import logging
import os

import boto3

logger = logging.getLogger(__name__)

# Assumed to be loaded from the environment (e.g. via .env)
SAGEMAKER_ENDPOINT = os.getenv("SAGEMAKER_ENDPOINT")


class SageMakerClient:
    def __init__(self):
        self.logger = logger
        self.logger.info("Initializing SageMakerClient...")
        self.client = boto3.client("sagemaker-runtime", region_name="us-east-2")
        self.endpoint_name = SAGEMAKER_ENDPOINT
        if not self.endpoint_name:
            logger.warning("⚠️ SageMaker endpoint is not configured. Set SAGEMAKER_ENDPOINT in .env.")

    def get_streaming_response(self, prompt):
        """
        Sends a structured prompt to the SageMaker LLM endpoint and streams the response.

        :param prompt: The structured prompt formatted as a JSON list.
        :return: The botocore EventStream for the response, or an error string on failure.
        """
        try:
            # Define inference parameters with streaming enabled
            inference_params = {
                "do_sample": True,
                "temperature": 0.1,
                "top_k": 50,
                "max_new_tokens": 512,
                "repetition_penalty": 1.03,
                "stop": ["</s>", "<|system|>", "<|user|>", "<|assistant|>"],
                "return_full_text": False,
            }
            body = json.dumps({"inputs": prompt, "parameters": inference_params, "stream": True})

            # Invoke the SageMaker endpoint with response streaming
            response = self.client.invoke_endpoint_with_response_stream(
                EndpointName=self.endpoint_name,
                Body=body,
                ContentType="application/json",
            )
            event_stream = response["Body"]
            return event_stream
        except Exception as e:
            self.logger.error(f"🚨 SageMaker error: {e}")
            return "Error processing request. Please try again."
```
@MattMorgis do you have an update on supporting AWS-hosted models? Or, more generally, any suggestions for @aatish-shinde? I can look into this tomorrow, @aatish-shinde.
@saqadri We're having this same convo over in #40.