
Adds SambaNova Cloud Inference Adapter


What does this PR do?

This PR adds a SambaNova inference adapter that enables integration with SambaNova's AI models through their OpenAI-compatible API.

Key features implemented:

  • Chat completion API with streaming support (see the sketch after this list)
  • Function/tool calling for supported models (Llama 3.1 series: 8B, 70B, 405B)
  • Support for additional Llama 3.2 models (1B and 3B text models, 11B and 90B vision models, etc.)
  • Distribution template based on the examples
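
To make the first item concrete, here is a minimal sketch of a streaming chat completion against SambaNova's OpenAI-compatible endpoint using the stock `openai` client. The base URL and model identifier below are assumptions for illustration, not values taken from this PR:

```python
# Minimal sketch: streaming chat completion against SambaNova's
# OpenAI-compatible API. Base URL and model name are assumptions.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.sambanova.ai/v1",  # assumed endpoint
    api_key=os.environ["SAMBANOVA_API_KEY"],
)

stream = client.chat.completions.create(
    model="Meta-Llama-3.1-8B-Instruct",  # assumed model identifier
    messages=[{"role": "user", "content": "Write a haiku about accelerators."}],
    stream=True,
)

# Print streamed tokens as they arrive.
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```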

What it does not support (see the illustrative guard sketched after this list):

  • Text completion API
  • Embeddings API
  • Certain OpenAI features (logprobs, presence/frequency penalties, parallel tool calls, etc.)
  • Response format options (JSON mode)
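
Purely as an illustration of how requests for these options could be handled (this is not the PR's actual code, and every name here is invented), one way to fail loudly instead of silently dropping unsupported parameters:

```python
# Hypothetical illustration only -- not code from this PR.
# Reject OpenAI-style options the provider cannot honor,
# rather than silently ignoring them.
from typing import Any, Dict

UNSUPPORTED_PARAMS = {
    "logprobs",
    "presence_penalty",
    "frequency_penalty",
    "parallel_tool_calls",
    "response_format",
}


def reject_unsupported(params: Dict[str, Any]) -> None:
    """Raise if the request uses options this provider does not support."""
    used = sorted(k for k, v in params.items() if k in UNSUPPORTED_PARAMS and v is not None)
    if used:
        raise ValueError(f"Options not supported by this provider: {used}")
```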

Test Plan

Run the following test command:

pytest -s -v --providers inference=sambanova llama_stack/providers/tests/inference/test_text_inference.py --env SAMBANOVA_API_KEY=<your-api-key>
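
For reference, a standalone smoke test in the same spirit might look like the following. This is illustrative only (it is not the contents of test_text_inference.py), and the base URL and model name are assumptions:

```python
# Illustrative smoke test -- not the repository's test_text_inference.py.
# Exercises streaming chat completion directly against the provider's
# OpenAI-compatible endpoint; base URL and model name are assumptions.
import os

import pytest
from openai import OpenAI


@pytest.mark.skipif("SAMBANOVA_API_KEY" not in os.environ, reason="requires SAMBANOVA_API_KEY")
def test_streaming_chat_completion_returns_text():
    client = OpenAI(
        base_url="https://api.sambanova.ai/v1",  # assumed endpoint
        api_key=os.environ["SAMBANOVA_API_KEY"],
    )
    stream = client.chat.completions.create(
        model="Meta-Llama-3.1-8B-Instruct",  # assumed model identifier
        messages=[{"role": "user", "content": "Reply with the single word: hello"}],
        stream=True,
    )
    text = "".join(c.choices[0].delta.content or "" for c in stream if c.choices)
    assert text.strip()
```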

To test the distribution template:

llama stack build --template sambanova --image-type conda
llama stack run ./run.yaml --port 5001 --env SAMBANOVA_API_KEY=<your-api-key>
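
Once the server is up on port 5001, a request like the one below should exercise the adapter end to end. This is a hedged sketch using the llama-stack-client SDK; the model identifier is an assumption, and parameter names may differ between client versions:

```python
# Sketch only: query the locally running SambaNova distribution.
# The model identifier is assumed; use one registered in your run.yaml.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5001")

response = client.inference.chat_completion(
    model_id="Meta-Llama-3.1-8B-Instruct",  # assumed identifier
    messages=[{"role": "user", "content": "Hello from the SambaNova adapter!"}],
)
print(response.completion_message.content)
```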

Sources

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Ran pre-commit to handle lint / formatting issues.
  • [x] Read the contributor guideline, Pull Request section?
  • [ ] Updated relevant documentation.
  • [x] Wrote necessary unit or integration tests.

swanhtet1992 · Nov 24, 2024