llama-stack
use task_type with vector db / rag tooling
🚀 Describe the new functionality needed
some embedding models are asymmetric: they give their best retrieval accuracy when text is embedded one way for storage (documents) and another way for search (queries).
EmbeddingRequest.task_type makes it possible to use these asymmetric models by telling the model which of the two cases an embedding is for.
when using a vector_db and rag_tool with an asymmetric model, insert should use the document task type and query should use the query task type.
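a minimal sketch of the intended behavior, in case it helps. everything here is hypothetical and self-contained: `RecordingEmbedder` and `ToyVectorDB` are toy stand-ins, not llama-stack classes, and the string values `"document"` and `"query"` are taken from the task type names used in this issue. the only point is that insert and query pass different task types to the same embedder.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Task type names as used in this issue; the real enum/values may differ.
DOCUMENT = "document"
QUERY = "query"


@dataclass
class RecordingEmbedder:
    """Toy embedder (hypothetical) that records which task_type it was given."""

    calls: List[Optional[str]] = field(default_factory=list)

    def embed(self, texts: List[str], task_type: Optional[str] = None) -> List[List[float]]:
        self.calls.append(task_type)
        # Toy 2-d vector: (length, number of spaces). A real asymmetric model
        # would use task_type to choose how the text is encoded.
        return [[float(len(t)), float(t.count(" "))] for t in texts]


@dataclass
class ToyVectorDB:
    """In-memory store (hypothetical) showing where each task type applies."""

    embedder: RecordingEmbedder
    rows: List[Tuple[str, List[float]]] = field(default_factory=list)

    def insert(self, chunks: List[str]) -> None:
        # Storage path: embed with the document task type.
        vectors = self.embedder.embed(chunks, task_type=DOCUMENT)
        self.rows.extend(zip(chunks, vectors))

    def query(self, text: str, k: int = 1) -> List[str]:
        # Search path: embed with the query task type.
        (q,) = self.embedder.embed([text], task_type=QUERY)

        def squared_distance(v: List[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(q, v))

        ranked = sorted(self.rows, key=lambda row: squared_distance(row[1]))
        return [chunk for chunk, _ in ranked[:k]]


db = ToyVectorDB(RecordingEmbedder())
db.insert(["a b", "hello world foo"])
nearest = db.query("x y")
```

after this runs, `db.embedder.calls` holds one document call followed by one query call, which is the behavior being requested for vector_db and rag_tool.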
💡 Why is this needed? What if we don't build it?
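without it, users of vector_db and rag_tool will get poor retrieval accuracy from asymmetric embedding models, because documents and queries are embedded the same way.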
Other thoughts
suggestion: when registering an embedding model, indicate whether task_type should be used. if it should, use task_type=document for /v1/tool-runtime/rag-tool/insert and task_type=query for /v1/tool-runtime/rag-tool/query.
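one possible shape for this, as a rough sketch only. the `uses_task_type` flag, `register_embedding_model`, `task_type_for`, and the model ids are all made up for illustration and are not existing llama-stack APIs; only the two route paths come from this issue.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class EmbeddingModelInfo:
    model_id: str
    # Hypothetical registration flag: True for asymmetric models.
    uses_task_type: bool = False


# Hypothetical in-memory registry, keyed by model id.
REGISTRY: Dict[str, EmbeddingModelInfo] = {}

# Route-to-task-type mapping proposed in this issue.
ROUTE_TASK_TYPES = {
    "/v1/tool-runtime/rag-tool/insert": "document",
    "/v1/tool-runtime/rag-tool/query": "query",
}


def register_embedding_model(model_id: str, uses_task_type: bool = False) -> EmbeddingModelInfo:
    info = EmbeddingModelInfo(model_id=model_id, uses_task_type=uses_task_type)
    REGISTRY[model_id] = info
    return info


def task_type_for(model_id: str, route: str) -> Optional[str]:
    """Return the task type to send, or None if the model does not use one."""
    info = REGISTRY[model_id]
    if not info.uses_task_type:
        return None
    return ROUTE_TASK_TYPES.get(route)


register_embedding_model("example-asymmetric-model", uses_task_type=True)
register_embedding_model("example-symmetric-model")
```

with this, symmetric models keep today's behavior (no task type is sent), and only models registered as asymmetric get the per-route task type.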