llama-stack
Composable building blocks to build Llama Apps
### 🚀 Describe the new functionality needed Integrate Model Context Protocol (MCP) server deployment into the run.yaml configuration. ### 💡 Why is this needed? What if we don't build it?...
The GPU model usage blocks the CPU. Move it to its own thread. Also wrap it in a lock to prevent multiple simultaneous runs from exhausting the GPU. Closes: #1746 #...
### System Info ``` θ82° [thoraxe:~/.llama/distributions/remote-vllm] [ols-llamastack] 130 $ llama stack build --image-type venv > Enter a name for your Llama Stack (e.g. my-local-stack): stack > Enter the image type...
### 🚀 Describe the new functionality needed Some embedding models are asymmetric, which means their best accuracy occurs when embedding for storage and embedding for query are handled differently. [EmbeddingRequest.task_type](https://github.com/meta-llama/llama-stack/blob/main/docs/_static/llama-stack-spec.yaml#L4319) allows for...
### 🚀 Describe the new functionality needed As mentioned in https://github.com/meta-llama/llama-stack/issues/1165#issuecomment-2670399136, there's interest in creating a comprehensive benchmark of VectorDBs for Llama Stack. The folks at @zilliztech have created https://github.com/zilliztech/VectorDBBench...
### System Info N/A ### Information - [ ] The official example scripts - [ ] My own modified scripts ### 🐛 Describe the bug Trying to run a distro...
# What does this PR do? The `load_tiktoken_bpe()` function depends on blobfile to load tokenizer.model files. However, blobfile brings in pycryptodomex, which is primarily used for JWT signing in GCP...
# What does this PR do? * Relocate redundant dependencies out of the core project and into the individual providers that actually require them. * Include all necessary server dependencies...
### 🚀 Describe the new functionality needed I noticed the change for https://github.com/meta-llama/llama-stack/pull/2153 is tagged [v0.2.8rc1](https://github.com/meta-llama/llama-stack/releases/tag/v0.2.8rc1). I would like to use this in docker, e.g. llamastack/distribution-ollama:0.2.8rc1 ### 💡 Why is...
### 🚀 Describe the new functionality needed We should require all builds to pass before merging. ### 💡 Why is this needed? What if we don't build it? To avoid merging...