llama-stack
Composable building blocks to build Llama Apps
# What does this PR do? This PR adds the inline vLLM inference provider to the regression tests for inference providers. The PR also fixes some regressions in that inference...
# What does this PR do? Add response format for agents structured output. - [ ] Using structured output for agents (interior_design app as an example) (#issue) https://github.com/meta-llama/llama-stack-apps/issues/122 ## Test...
### 🚀 Describe the new functionality needed For this request: ```python response = client.inference.chat_completion( model_id=MODEL_ID, messages=[ {"role": "user", "content": "Hello World"}, ], response_format={ "type": "json_schema", "json_schema": { "name": "Plan", "description":...
### 🚀 Describe the new functionality needed - See prerequisite issue in: https://github.com/meta-llama/llama-stack/issues/651 **Why** - We currently use providers/tests (to test impls) and tests/client-sdk to test SDK (via directClient &...
### 🚀 Describe the new functionality needed **Why** - We want a comprehensive & consolidated test suite covering all functionalities **What** - Audit existing tests on functionalities in providers/tests -...
# What does this PR do? Agents to use tools API ## Test Plan `pytest -s -v -k fireworks llama_stack/providers/tests/agents/test_agents.py --safety-shield=meta-llama/Llama-Guard-3-8B --inference-model=meta-llama/Llama-3.1-8B-Instruct`
### System Info Fedora Linux, used to run the `llama stack build` command. ### Information - [x] The official example scripts - [ ] My own modified scripts ### 🐛 Describe...
# What does this PR do? This adds some initial content documenting our OpenAI compatible APIs - Responses, Chat Completions, Completions, and Models - along with instructions on how to...
# What does this PR do? Enable ingestion of precomputed embeddings with `Chunks`. This PR enhances the Llama Stack vector database APIs, schemas, and documentation to allow users to supply...