llama-stack

use task_type with vector db / rag tooling

Open mattf opened this issue 7 months ago • 1 comment

🚀 Describe the new functionality needed

some embedding models are asymmetric: they give their best retrieval accuracy when text embedded for storage (documents) and text embedded for search (queries) are handled differently.

EmbeddingRequest.task_type allows for using these asymmetric models.

when using a vector_db and rag_tool with an asymmetric model, insert should use the document task type and query should use the query task type.
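a minimal sketch of the behavior requested above, not the llama-stack implementation. `TaskType`, `ToyVectorStore`, and `toy_embed` are hypothetical names made up for illustration; the only thing taken from the issue is the idea that insert embeds with the document task type and query embeds with the query task type.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Tuple


class TaskType(Enum):
    # Hypothetical stand-in for the values passed as EmbeddingRequest.task_type.
    DOCUMENT = "document"
    QUERY = "query"


# Any callable that embeds text differently depending on the task type.
EmbedFn = Callable[[str, TaskType], List[float]]


def toy_embed(text: str, task_type: TaskType) -> List[float]:
    """Toy asymmetric embedder: the task type changes the input that is embedded."""
    prefixed = f"{task_type.value}: {text.lower()}"
    vec = [0.0] * 8
    for i, ch in enumerate(prefixed):
        vec[i % 8] += float(ord(ch))
    return vec


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


@dataclass
class ToyVectorStore:
    embed: EmbedFn
    rows: List[Tuple[str, List[float]]] = field(default_factory=list)

    def insert(self, chunks: List[str]) -> None:
        # Stored text is always embedded with the document task type.
        for chunk in chunks:
            self.rows.append((chunk, self.embed(chunk, TaskType.DOCUMENT)))

    def query(self, text: str, k: int = 1) -> List[str]:
        # Search text is always embedded with the query task type.
        q = self.embed(text, TaskType.QUERY)
        ranked = sorted(self.rows, key=lambda row: dot(q, row[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


store = ToyVectorStore(embed=toy_embed)
store.insert(["llamas are camelids", "embeddings map text to vectors"])
results = store.query("what is an embedding?", k=1)
```

the point of the sketch is that the caller never picks the task type: the store decides it from the operation, so an asymmetric model is used correctly by default.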

💡 Why is this needed? What if we don't build it?

users of asymmetric embedding models will get poor retrieval accuracy, because documents and queries will be embedded the same way.

Other thoughts

suggestion: when registering an embedding model, indicate whether task_type should be used. if it should, use task_type=document for /v1/tool-runtime/rag-tool/insert and task_type=query for /v1/tool-runtime/rag-tool/query
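one way the suggestion could look, as a sketch under assumptions: the two route paths and the document/query values come from this issue, but the `uses_task_type` metadata key and the `task_type_for` helper are hypothetical and do not exist in llama-stack.

```python
from typing import Optional

# Route paths and task type values are from the issue; the mapping is a sketch.
ROUTE_TASK_TYPES = {
    "/v1/tool-runtime/rag-tool/insert": "document",
    "/v1/tool-runtime/rag-tool/query": "query",
}


def task_type_for(route: str, model_metadata: dict) -> Optional[str]:
    """Return the task type to send for this route, or None to omit it."""
    # `uses_task_type` is a hypothetical flag set when the model is registered.
    if not model_metadata.get("uses_task_type", False):
        return None
    return ROUTE_TASK_TYPES.get(route)


asymmetric_model = {"uses_task_type": True}
symmetric_model = {}

insert_task = task_type_for("/v1/tool-runtime/rag-tool/insert", asymmetric_model)
query_task = task_type_for("/v1/tool-runtime/rag-tool/query", asymmetric_model)
default_task = task_type_for("/v1/tool-runtime/rag-tool/query", symmetric_model)
```

models registered without the flag keep today's behavior (no task type is sent), so symmetric models are unaffected.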

mattf, Mar 28 '25 00:03