llama-stack
Composable building blocks to build Llama Apps
# Add SambaNova Inference [SambaNova Inference](https://cloud.sambanova.ai/apis) provides a free Llama inference server, allowing me to expand the selection of inference options for my project. This pull request introduces a new...
### 🚀 Describe the new functionality needed
- We should be able to persist the localfs datasetio provider: https://github.com/meta-llama/llama-stack/blob/b1a63df8cdae6e45d1db10f8c73eca6cd75ba68e/llama_stack/providers/inline/datasetio/localfs/datasetio.py#L82-L98
- Example persistence for the huggingface datasetio provider: https://github.com/meta-llama/llama-stack/blob/b1a63df8cdae6e45d1db10f8c73eca6cd75ba68e/llama_stack/providers/remote/datasetio/huggingface/huggingface.py#L55-L78

### 💡 Why is...
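For context, a minimal self-contained sketch of the kind of persistence being asked for: registered datasets are written to a small key-value store (plain SQLite here, not llama-stack's actual kvstore utility) so they survive a provider restart. The `DATASETS_PREFIX` key layout and the `DatasetRegistry` class are illustrative assumptions mirroring the huggingface provider's pattern, not the real implementation.

```python
import json
import sqlite3

DATASETS_PREFIX = "localfs_datasets:"  # hypothetical key prefix, mirroring the huggingface provider's pattern


class DatasetRegistry:
    """Toy kvstore-backed registry: persists dataset registrations across restarts."""

    def __init__(self, db_path: str = "localfs_datasetio.db"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")

    def register_dataset(self, dataset_id: str, metadata: dict) -> None:
        # Serialize the dataset definition and store it under a prefixed key.
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
            (DATASETS_PREFIX + dataset_id, json.dumps(metadata)),
        )
        self.conn.commit()

    def load_all(self) -> dict[str, dict]:
        # On startup, re-hydrate every previously registered dataset.
        rows = self.conn.execute(
            "SELECT key, value FROM kv WHERE key LIKE ?", (DATASETS_PREFIX + "%",)
        ).fetchall()
        return {k[len(DATASETS_PREFIX):]: json.loads(v) for k, v in rows}


registry = DatasetRegistry()
registry.register_dataset("eval-set", {"provider_id": "localfs", "url": "file:///data/eval.jsonl"})
print(registry.load_all())
```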
### 🚀 The feature, motivation and pitch I see that llama-stack is becoming a very powerful set of tools that sits on top of LLM models: inference, memory, agents, scoring, eval...
# What does this PR do? This PR adds a Groq inference adapter. Key features implemented:
- Chat completion API with streaming support
- Distribution template for easy deployment

What...
### 🚀 The feature, motivation and pitch Some inference providers (fireworks, together, meta-reference) support guided decoding (specifying a JSON schema, for example, as a "grammar" for decoding) with inference. vLLM supports this functionality --...
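As an illustration of the requested capability, here is a hedged sketch of exercising guided decoding against a vLLM OpenAI-compatible server directly; the `guided_json` extra-body parameter is vLLM's own mechanism, and the base URL, model name, and schema are placeholders rather than anything defined in this issue.

```python
from openai import OpenAI

# Placeholder endpoint/model; assumes a vLLM OpenAI-compatible server is running locally.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# JSON schema acting as a "grammar" that constrains the decoded output.
answer_schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number"},
    },
    "required": ["answer", "confidence"],
}

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Is the sky blue? Reply as JSON."}],
    extra_body={"guided_json": answer_schema},  # vLLM-specific guided decoding parameter
)
print(response.choices[0].message.content)
```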
# What does this PR do? This PR adds a [Groq](https://console.groq.com/playground) inference provider that allows integration with Groq's AI inference offerings for Llama models. Groq has an OpenAI-compatible endpoint. Added...
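Because the endpoint is OpenAI-compatible, a thin adapter can largely delegate to a stock OpenAI client. A rough usage sketch (the base URL and model alias are Groq's public values, not code from this PR), including the streaming path mentioned above:

```python
import os

from openai import OpenAI

# Groq exposes an OpenAI-compatible API at this base URL.
client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ["GROQ_API_KEY"],
)

# Streamed chat completion against a Groq-hosted Llama model.
stream = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Summarize what an inference adapter does."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```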
Summary: Implementing Memory provider fakes as discussed in this draft https://github.com/meta-llama/llama-stack/pull/490#issuecomment-2492877393. High level changes:
* Fake provider is specified via the "fake" mark
* Test config will set up a fake...
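To illustrate the pattern (not the actual code in this PR), here is a pytest-style sketch in which a `fake` mark causes the test setup to hand back an in-memory fake of the memory provider; `FakeMemoryProvider` and the `memory_provider` fixture are hypothetical names.

```python
import pytest


class FakeMemoryProvider:
    """Hypothetical in-memory stand-in for a real memory/vector provider."""

    def __init__(self):
        self.docs = {}

    def insert(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = text

    def query(self, text: str) -> list[str]:
        # Trivial "retrieval": return docs sharing any word with the query.
        words = set(text.lower().split())
        return [d for d in self.docs.values() if words & set(d.lower().split())]


@pytest.fixture
def memory_provider(request):
    # If the test carries the "fake" mark, return the fake instead of a real provider.
    if request.node.get_closest_marker("fake"):
        return FakeMemoryProvider()
    pytest.skip("real memory provider not configured in this sketch")


@pytest.mark.fake
def test_insert_and_query(memory_provider):
    memory_provider.insert("doc1", "llama stack memory banks")
    assert memory_provider.query("memory") == ["llama stack memory banks"]
```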
### 🚀 The feature, motivation and pitch Redis KVStore has seen relatively little use and does not yet have an integration test fixture. We need to add a fixture...
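A hedged sketch of what such a fixture might look like, written against redis-py directly rather than llama-stack's Redis KVStore wrapper; the environment variable names and key prefix are assumptions for illustration.

```python
import os

import pytest
import redis


@pytest.fixture(scope="session")
def redis_kvstore():
    """Integration-test fixture: connect to a local Redis, skip if unreachable."""
    client = redis.Redis(
        host=os.environ.get("REDIS_HOST", "localhost"),
        port=int(os.environ.get("REDIS_PORT", "6379")),
        decode_responses=True,
    )
    try:
        client.ping()
    except redis.exceptions.ConnectionError:
        pytest.skip("Redis is not available for integration tests")
    yield client
    client.flushdb()  # clean up keys written by the tests


def test_set_and_get(redis_kvstore):
    redis_kvstore.set("llama_stack:test_key", "value")
    assert redis_kvstore.get("llama_stack:test_key") == "value"
```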
### System Info
Python version: 3.11.10 | packaged by Anaconda, Inc. | (main, Oct 3 2024, 07:22:26) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22631-SP0
Is CUDA available:...
### System Info
llama_models 0.0.53
llama_stack 0.0.53
llama_stack_client 0.0.53

### Information
- [ ] The official example scripts
- [ ] My own modified scripts

### 🐛 Describe the bug...