Add sambanova provider
Why this PR
We want to set up SambaNova as a remote inference provider for llama-stack.
What is in the PR
Integration with the distribution, using the OpenAI client library against SambaNova's OpenAI-compatible API.
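At a high level, the provider delegates to the standard OpenAI Python client pointed at SambaNova's endpoint. A minimal sketch of the idea (not the exact code in this PR; the base URL, environment variable name, and model id below are assumptions):

```python
import os

from openai import OpenAI

# Point the standard OpenAI client at SambaNova's OpenAI-compatible endpoint.
# The URL and env var name here are illustrative assumptions.
client = OpenAI(
    base_url="https://api.sambanova.ai/v1",
    api_key=os.environ["SAMBANOVA_API_KEY"],
)

response = client.chat.completions.create(
    model="Meta-Llama-3.1-8B-Instruct",  # provider-side model id (assumed)
    messages=[{"role": "user", "content": "hello world"}],
    stream=False,
)
print(response.choices[0].message.content)
```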
How I tested
- Start the distribution
```bash
llama stack run remote_sambanova --port 12345
```
- Invoke the call (non-streaming)
```bash
curl -X POST http://localhost:12345/inference/chat_completion \
  -H "Content-Type: application/json" \
  -d '{"model":"Llama3.1-8B-Instruct","messages":[{"content":"hello world, write me a 2 sentence poem about the moon", "role": "user"}],"stream":false}'
```
and the response is:
```
data: {"completion_message":{"role":"assistant","content":"Here's a 2-sentence poem about the moon:\n\nThe moon glows softly in the midnight sky,\nA beacon of wonder, as it passes by.","stop_reason":"end_of_turn","tool_calls":[]},"logprobs":null}
```
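For reference, the same non-streaming request from Python with the requests library (a usage sketch, not part of the PR):

```python
import requests

# Mirror of the curl call above; prints the raw response body.
resp = requests.post(
    "http://localhost:12345/inference/chat_completion",
    json={
        "model": "Llama3.1-8B-Instruct",
        "messages": [
            {
                "role": "user",
                "content": "hello world, write me a 2 sentence poem about the moon",
            }
        ],
        "stream": False,
    },
)
resp.raise_for_status()
print(resp.text)
```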
- Invoke the call (streaming)
```bash
curl -X POST http://localhost:12345/inference/chat_completion \
  -H "Content-Type: application/json" \
  -d '{"model":"Llama3.1-8B-Instruct","messages":[{"content":"hello world, write me a 2 sentence poem about the moon", "role": "user"}],"stream":true}'
```
and the response is:
```
data: {"event":{"event_type":"start","delta":"","logprobs":null,"stop_reason":null}}
data: {"event":{"event_type":"progress","delta":"","logprobs":null,"stop_reason":null}}
data: {"event":{"event_type":"progress","delta":"","logprobs":null,"stop_reason":null}}
data: {"event":{"event_type":"progress","delta":"Here's a 2-sentence poem about the moon:\n\nThe moon glows softly in the ","logprobs":null,"stop_reason":null}}
data: {"event":{"event_type":"progress","delta":"midnight sky,\nA beacon of wonder, as it ","logprobs":null,"stop_reason":null}}
data: {"event":{"event_type":"progress","delta":"passes by.","logprobs":null,"stop_reason":null}}
data: {"event":{"event_type":"progress","delta":"","logprobs":null,"stop_reason":null}}
data: {"event":{"event_type":"complete","delta":"","logprobs":null,"stop_reason":"end_of_turn"}}
```
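The stream can be consumed from Python by reading the `data:` lines and concatenating the deltas (again a usage sketch, not part of the PR):

```python
import json

import requests

# Stream the response and accumulate the text deltas from each event.
with requests.post(
    "http://localhost:12345/inference/chat_completion",
    json={
        "model": "Llama3.1-8B-Instruct",
        "messages": [
            {
                "role": "user",
                "content": "hello world, write me a 2 sentence poem about the moon",
            }
        ],
        "stream": True,
    },
    stream=True,
) as resp:
    resp.raise_for_status()
    chunks = []
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue
        event = json.loads(line[len("data: "):])["event"]
        chunks.append(event["delta"])
        if event["event_type"] == "complete":
            break
    print("".join(chunks))
```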