llama-stack
Adds SambaNova Cloud Inference Adapter
## What does this PR do?
This PR adds a SambaNova inference adapter that enables integration with SambaNova's AI models through their OpenAI-compatible API.
Key features implemented:
- Chat completion API with streaming support
- Function/tool calling for supported models (3.1 series: 8B, 70B, 405B)
- Support for multiple Llama 3 models (1B and 3B, plus the 11B and 90B vision models, among others)
- Distribution template based on the examples
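Since the adapter targets SambaNova's OpenAI-compatible API, tool definitions for the 3.1-series models follow the OpenAI function-calling schema. A minimal sketch (the `get_weather` tool and its parameter schema are illustrative, not taken from this PR):

```python
# Illustrative tool definition in the OpenAI function-calling format,
# as accepted by the 3.1-series models through the adapter.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function name
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}
```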
What it does not support:
- Text completion API
- Embeddings API
- Certain OpenAI features (logprobs, presence/frequency penalties, parallel tool calls, etc.)
- Response format options (JSON mode)
## Test Plan
Run the following test command:

```shell
pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py --env SAMBANOVA_API_KEY=<your-api-key>
```
To test the distribution template:

```shell
llama stack build --template sambanova --image-type conda
llama stack run ./run.yaml --port 5001 --env SAMBANOVA_API_KEY=<your-api-key>
```
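For reference, the inference section of the generated `run.yaml` would take roughly the following shape; the field names mirror other remote inference providers and are an assumption, not copied from this PR:

```yaml
# Hypothetical run.yaml fragment for the sambanova provider;
# field names are assumed from the usual remote-provider layout.
providers:
  inference:
    - provider_id: sambanova
      provider_type: remote::sambanova
      config:
        url: https://api.sambanova.ai/v1
        api_key: ${env.SAMBANOVA_API_KEY}
```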
## Sources
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [x] Read the contributor guideline, Pull Request section?
- [ ] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.