llama-stack issues

[RFC] Preprocessing endpoint for RAG and other uses

6

### 🚀 Describe the new functionality needed ### Summary As discussed in #1061, this RFC introduces the design and endpoint specification for managing, invoking and serving document preprocessors. A preprocessor...

ilya-kolchinsky

enhancement

RAG

stale

[RFC] Integrate Hyperparameter Optimization into Llama Stack

5

### 🚀 Describe the new functionality needed This issue proposes integrating Hyperparameter Optimization (HPO) into the Llama stack to enhance model performance tuning and improve efficiency in parameter selection. ###...

varshaprasad96

enhancement

RAG

stale

Improve RAG as attachment behaviour in agent

3

### 🚀 Describe the new functionality needed - We currently perform ad hoc preprocessing and ingestion on the fly for documents attached in the agent. Code Pointer: https://github.com/meta-llama/llama-stack/blob/33b096cc21e48910cf05f0c3e513032adb99fa84/llama_stack/providers/inline/agents/meta_reference/agent_instance.py#L922-L930 - We should...

yanxi0830

enhancement

RAG

stale

Quick Start steps result in sqlite3 error

6

### System Info ``` PyTorch version: 2.6.0+cu124 Is debug build: False CUDA used to build PyTorch: 12.4 ROCM used to build PyTorch: N/A OS: Fedora Linux 40 (Workstation Edition) (x86_64)...

nathan-weinberg

bug

stale

feat: Implement keyword search in milvus

5

# What does this PR do? This PR adds the keyword search implementation for Milvus. Along with the implementation for remote Milvus, the tests require us to start a Milvus...

varshaprasad96

CLA Signed

Multiple providers blocking the async event loop

8

### 🐛 Describe the bug Llama Stack uses FastAPI and an async event loop. FastAPI uses a single event loop to dispatch requests to all async request handlers. If this...

bbrowning

bug

Connecting to MCP tools results in an internal server error

3

### System Info GPU Type: NVIDIA A100 OS: Ubuntu 24.04 CUDA: 12.8 ### Information - [ ] The official example scripts - [x] My own modified scripts ### 🐛 Describe...

rchaganti

bug

stale

Error Running Llama Stack Playground

### System Info PyTorch version: 2.7.0+cpu Is debug build: False CUDA used to build PyTorch: None ROCM used to build PyTorch: N/A OS: Ubuntu 22.04.5 LTS (x86_64) GCC version: (Ubuntu...

FranzRome

bug

Provide better control over the RAG ingestion stages (conversion, chunking, embedding, storing)

8

### 🚀 Describe the new functionality needed As of now, RAG ingestion chunks documents using a trivial algorithm of overlapping chunks and converts PDFs (and PDFs only)...

ilya-kolchinsky

enhancement

RAG

stale

chore: convert blocking calls to async calls in some providers

2

# What does this PR do? Converts blocking calls to async calls within the following providers/components: - runpod (inference) - sentence_transformers (inference) - litellm (inference)...

jaideepr97

CLA Signed
