Pydantic Validation Error in VectorStoreSearchResponse with List-Type Metadata
System Info
- Llama Stack Version: 0.2.23
- Python Version: 3.11
- Vector Store Backend: FAISS (affects all backends)
Information
- [ ] The official example scripts
- [ ] My own modified scripts
🐛 Describe the bug
Vector store search operations fail with Pydantic validation errors when chunk metadata contains list-type values (e.g., tags). This occurs because the VectorStoreSearchResponse model restricts attributes to only primitive types (str | float | bool), while the input Chunk.metadata accepts any type (dict[str, Any]).
Steps to Reproduce
- Ingest documents with metadata containing lists:

```python
chunks = [
    Chunk(
        content="Model information...",
        metadata={
            "tags": ["transformers", "h100-compatible", "region:us"],
            "model_name": "granite-3.3-8b",
        },
    )
]
await vector_io.insert_chunks(vector_db_id, chunks)
```
- Search the vector store:

```python
response = await vector_io.openai_search_vector_store(
    vector_store_id="vs_...",
    query="models compatible with H100 GPU",
    max_num_results=10,
)
```
Actual Behavior
The search returns empty results, and the logs show a Pydantic validation error:

```text
[ERROR] Error searching vector store vs_6ce6f6c8-09b9-4e54-a4c5-3a78f7688805: 6 validation errors for VectorStoreSearchResponse
attributes.tags.str
  Input should be a valid string [type=string_type, input_value=['transformers', 'safeten...ompatible', 'region:us'], input_type=list]
attributes.tags.float
  Input should be a valid number [type=float_type, input_value=['transformers', 'safeten...ompatible', 'region:us'], input_type=list]
attributes.tags.bool
  Input should be a valid boolean [type=bool_type, input_value=['transformers', 'safeten...ompatible', 'region:us'], input_type=list]
```
Expected Behavior
The search should return results with metadata intact, supporting the same flexible metadata types at retrieval that are accepted at ingestion.
Root Cause
Schema Mismatch:
Input Schema (`llama_stack/apis/vector_io/vector_io.py:71`):

```python
class Chunk(BaseModel):
    metadata: dict[str, Any] = Field(default_factory=dict)  # ✅ Accepts ANY type
```
Output Schema (`llama_stack/apis/vector_io/vector_io.py:250`):

```python
class VectorStoreSearchResponse(BaseModel):
    attributes: dict[str, str | float | bool] | None = None  # ❌ Only primitives
```
Direct Pass-through (`llama_stack/providers/utils/memory/openai_vector_store_mixin.py:606`):

```python
response_data_item = VectorStoreSearchResponse(
    file_id=chunk.metadata.get("document_id", ""),
    filename=chunk.metadata.get("filename", ""),
    score=score,
    attributes=chunk.metadata,  # ❌ No transformation/validation
    content=content,
)
```
Error logs

```text
[ERROR] Error searching vector store vs_6ce6f6c8-09b9-4e54-a4c5-3a78f7688805: 6 validation errors for VectorStoreSearchResponse
attributes.tags.str
  Input should be a valid string [type=string_type, input_value=['transformers', 'safeten...ompatible', 'region:us'], input_type=list]
    For further information visit https://errors.pydantic.dev/2.12/v/string_type
attributes.tags.float
  Input should be a valid number [type=float_type, input_value=['transformers', 'safeten...ompatible', 'region:us'], input_type=list]
    For further information visit https://errors.pydantic.dev/2.12/v/float_type
attributes.tags.bool
  Input should be a valid boolean [type=bool_type, input_value=['transformers', 'safeten...ompatible', 'region:us'], input_type=list]
    For further information visit https://errors.pydantic.dev/2.12/v/bool_type
attributes.last_modified.str
  Input should be a valid string
    For further information visit https://errors.pydantic.dev/2.12/v/string_type
attributes.last_modified.float
  Input should be a valid number
    For further information visit https://errors.pydantic.dev/2.12/v/float_type
attributes.last_modified.bool
  Input should be a valid boolean
    For further information visit https://errors.pydantic.dev/2.12/v/bool_type
```
Workaround
In addition to the fix itself, until it lands the user can encode list values as a comma-separated string at ingestion, e.g. `{"tags": "tag0,tag1"}`, and split them back into a list on the client side with `output.split(',')`.
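The comma-encoding workaround can be sketched as follows (assuming tag values never contain commas):

```python
tags = ["transformers", "h100-compatible", "region:us"]

# At ingestion: encode the list as a single string so it passes the
# primitive-only attributes schema
metadata = {"tags": ",".join(tags)}

# At retrieval: split it back into a list on the client side
decoded = metadata["tags"].split(",")
assert decoded == tags
```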
Fixed by https://github.com/llamastack/llama-stack/pull/4173?