Vector search and embeddings API
Currently we don't have anything, and the RAG example just uses OpenAI's plain API to generate embeddings.
It seems simple enough to add a dedicated API on models to generate embeddings. That wouldn't provide much on top of what the OpenAI SDK already offers, but it would help a lot with Gemini, where there's currently no interface for this in what we have.
I suspect that the vector search part is harder to provide an API for: anything beyond toy examples will require full control of the database being searched, and we're not (yet) building an ORM.
Am I wrong or missing something?
Can the retriever pattern be reused for vector search?
Tool calls already work well for vector search, see https://ai.pydantic.dev/examples/rag/.
The main thing we can add is a model-agnostic interface for creating embeddings.
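For reference, the linked example wires retrieval in as an agent tool, roughly like this (simplified sketch; the Deps class and its search method stand in for the example's actual database and embedding setup):

from dataclasses import dataclass

from pydantic_ai import Agent, RunContext


@dataclass
class Deps:
    """Stands in for the example's database pool and embedding client."""

    async def search(self, query: str) -> list[str]:
        # Real implementation: embed the query, then run a vector search.
        return ['...matching documentation sections...']


agent = Agent('openai:gpt-4o', deps_type=Deps)


@agent.tool
async def retrieve(context: RunContext[Deps], search_query: str) -> str:
    """Retrieve documentation sections relevant to the search query."""
    sections = await context.deps.search(search_query)
    return '\n\n'.join(sections)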
Hey Samuel, is there any possibility that, instead of agent.tools, pydantic-ai could have a proper API integration for vector DBs? That would make it much easier for users to use the retriever pattern. cc: @stephen37
Is there a possibility of having something like a standard API for vector search and then people can build adapters for specific vector db implementations? Perhaps a reference implementation for one or two of the more popular FOSS vector dbs could be included in PydanticAI to begin with, with community contributions of others welcome.
Here is a draft suggestion for the Vector Store APIs.
It could be a separate GitHub and PyPI project [pydantic-ai-vectorstores], with the ABCs living in pydantic-ai itself, so that implementers can extend the interface and build vector-store-specific integrations that meet the minimum threshold specified in the project docs (500k downloads or something similar).
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any


class Embeddings(ABC):
    """Generates embeddings for document chunks and query strings."""

    @abstractmethod
    async def vectorize_documents(self, document_chunks: list[str]) -> list[list[float]]:
        """Generates document embeddings for a list of chunks."""

    @abstractmethod
    async def vectorize_query(self, text: str) -> list[float]:
        """Generates an embedding for the query string."""

    @abstractmethod
    def vectorize_documents_sync(self, document_chunks: list[str]) -> list[list[float]]:
        """Synchronous version of vectorize_documents()."""

    @abstractmethod
    def vectorize_query_sync(self, text: str) -> list[float]:
        """Synchronous version of vectorize_query()."""


@dataclass
class Document:
    """Represents a document or record added to a vector store."""

    id: str
    content: str
    meta_data_fields: dict[str, Any] = field(default_factory=dict)


class VectorStore(ABC):
    """Base class for vector store implementations."""

    embeddings: Embeddings

    @abstractmethod
    async def add_documents(self, documents: list[Document], **kwargs: Any) -> list[str]:
        """Adds a list of documents to the vector store and returns their unique identifiers."""

    @abstractmethod
    async def add_document_chunks(self, documents: list[str], **kwargs: Any) -> list[str]:
        """Can use VectorStore.add_documents() to prepare records for insertion into the vector store."""

    @abstractmethod
    async def delete_documents(self, document_ids: list[str]) -> None:
        """Deletes the specified documents by their record identifiers."""

    @abstractmethod
    async def search(self, query: str, search_type: str, **kwargs: Any) -> list[Document]:
        """Implementors can define valid search types in subclasses. Can use VectorStore.search_with_embeddings() for search."""

    @abstractmethod
    async def search_with_embeddings(self, query: list[float], search_type: str, **kwargs: Any) -> list[Document]:
        """Implementors can define valid search types in subclasses."""
@izzyacademy Great suggestion! I'd like to expand on it with an additional idea.
Some packages also support customized responses with a retrieval query, where a query fragment can be passed as a variable to the search function, enabling more tailored responses. For example, in the case of Neo4j, the underlying search function query might look like this:
read_query = (
    "CALL db.index.vector.queryNodes($index, $k, $embedding) "
    "YIELD node, score "
) + retrieval_query
Here, retrieval_query can be used to customize the response, as demonstrated below:
retrieval_query = """
RETURN "Name:" + node.name AS text, score, {foo:"bar"} AS metadata
"""
Building on this idea, I propose the following structure for the search function:
@abstractmethod
async def search(self, query: str, search_type: str, retrieval_query: str | None = None, **kwargs: Any) -> list[Document]:
    """Implementors can define valid search types in subclasses. Can use VectorStore.search_with_embeddings() for search."""

@abstractmethod
async def search_with_embeddings(self, query: list[float], search_type: str, retrieval_query: str | None = None, **kwargs: Any) -> list[Document]:
    """Implementors can define valid search types in subclasses."""
This approach allows for greater flexibility and customization in tailoring search results.
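A hypothetical call against a Neo4j-backed implementation would then look like this (store, the Neo4jVectorStore behind it, and the 'similarity' search type are illustrative):

results = await store.search(
    query='graph databases',
    search_type='similarity',
    retrieval_query='RETURN "Name:" + node.name AS text, score, {foo:"bar"} AS metadata',
)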
@izzyacademy Can I suggest we also include a num_dimensions integer parameter or similar in the Embeddings class, as that is useful for plumbing it into various search systems later on? It should probably be a class property, but you get the idea.
class Embeddings(ABC):
    """Generates embeddings for document chunks and query strings."""

    @property
    @abstractmethod
    def num_dimensions(self) -> int:
        """The number of dimensions in the resulting vector."""

    @abstractmethod
    async def vectorize_documents(self, document_chunks: list[str]) -> list[list[float]]:
        """Generates document embeddings for a list of chunks."""

    @abstractmethod
    async def vectorize_query(self, text: str) -> list[float]:
        """Generates an embedding for the query string."""

    @abstractmethod
    def vectorize_documents_sync(self, document_chunks: list[str]) -> list[list[float]]:
        """Synchronous version of vectorize_documents()."""

    @abstractmethod
    def vectorize_query_sync(self, text: str) -> list[float]:
        """Synchronous version of vectorize_query()."""
Some text splitters would also be interesting to have... Maybe create some helper functions for them, because the examples always assume you already have your content chunked and embedded in the database.
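For illustration, a naive character-window splitter is only a few lines; a real helper would respect sentence and paragraph boundaries:

def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size characters."""
    assert 0 <= overlap < chunk_size
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start : start + chunk_size])
        start += chunk_size - overlap
    return chunks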
@izzyacademy Just wondering what the timing looks like for this on your roadmap? I need something soon and will roll my own if it's not looking like a new feature over the next few weeks. Thanks for all your hard work on this library. We are really enjoying using it.
@Kludex I know you guys are busy with the v1 release, but are there any plans for pydantic-ai to do RAG?
Hello!
Sentence Transformers maintainer here. I've been getting some requests asking whether Sentence Transformers can be used alongside Pydantic-AI for a RAG of sorts. Is this something that you would imagine supporting? I see that the majority (all?) of your AI functionality is over API calls at the moment, whereas Sentence Transformers is fully local in contrast. People tend not to realise that you can get e.g. 100 sentences/second processed on a CPU with solid embedding models like yesterday's EmbeddingGemma: https://huggingface.co/blog/embeddinggemma.
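For reference, local embedding generation is only a couple of lines (all-MiniLM-L6-v2 here is just an example; any embedding model on the Hub works):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = model.encode(['How do I use pydantic-ai for RAG?'])  # numpy array of shape (1, 384)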
I'd love to be able to support this project alongside the other frameworks whose usage with embedding models was described in that blogpost.
- Tom Aarsen
@tomaarsen We're definitely interested in supporting that! I expect to dive into this in October. I'll find you if I have any questions!
All I need is raw embeddings. I already have my own vector database. Could you prioritize this part?
@gvanrossum Hi Guido, have you seen https://github.com/pydantic/pydantic-ai/pull/3252? It currently covers the OpenAI and Cohere embeddings APIs. I'd expect it to work with Azure OpenAI as well, but I'll double check. The PR still needs some work but you could try it out already. I hope to finish it next week.
@DouweM - To avoid regressions in typeagent I need only openai and azure/openai. Everything else is an improvement, as long as it supports embeddings -- I don't want my users to have to provide an openai key in addition to a key for whatever completion provider they want to use, if their provider supports embeddings.