Querying a vector database with AgentAI
This PR adds initial support for querying a vector database (Chroma DB for now) with AgentAI.
The query model for Chroma looks like this:
from typing import List, Optional

# Embedding and Include are Chroma's type aliases (assumed importable from chromadb.api.types).
from chromadb.api.types import Embedding, Include
from pydantic import BaseModel, Field

class Query(BaseModel):
    """Query model to search the vector database. If query_embeddings is provided, query_texts will be ignored."""
    query_embeddings: Optional[List[Embedding]] = Field(None, description="Embedding for the query to search")
    query_texts: Optional[List[str]] = Field(None, description="Simplified query from the user to search")
    k: int = Field(..., description="The number of results requested")
    include: Include = Field(
        ["documents", "embeddings", "metadatas", "distances"], description="Data to include in results"
    )
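The precedence rule in the docstring (embeddings win over texts) can be illustrated with a minimal sketch. `QuerySketch` and `to_chroma_kwargs` are hypothetical names that are not part of this PR, and a plain dataclass stands in for the pydantic model so the snippet runs without dependencies. The keyword names `query_embeddings`, `query_texts`, `n_results`, and `include` are the ones Chroma's `collection.query()` accepts; how `client_db.get_docs` actually maps the model is not shown here and may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class QuerySketch:
    """Hypothetical stand-in for the Query model (plain dataclass, no pydantic)."""
    k: int
    query_embeddings: Optional[List[List[float]]] = None
    query_texts: Optional[List[str]] = None
    include: List[str] = field(
        default_factory=lambda: ["documents", "embeddings", "metadatas", "distances"]
    )


def to_chroma_kwargs(query: QuerySketch) -> Dict[str, Any]:
    """Build keyword arguments suitable for Chroma's collection.query()."""
    kwargs: Dict[str, Any] = {"n_results": query.k, "include": list(query.include)}
    if query.query_embeddings is not None:
        # Embeddings take precedence, so query_texts is ignored.
        kwargs["query_embeddings"] = query.query_embeddings
    elif query.query_texts is not None:
        kwargs["query_texts"] = query.query_texts
    else:
        raise ValueError("Provide either query_embeddings or query_texts")
    return kwargs


example = to_chroma_kwargs(
    QuerySketch(k=3, query_texts=["where does food come from"], include=["documents", "distances"])
)
print(example)
```

Raising when neither field is set is a design choice of this sketch; the model in the PR allows both to be None.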
An example of how a tool can use this model:
@tool(registry=db_registry)
def query_vector_db(query: Query):
    """
    Ask the vector database a question
    """
    print(f"Querying vector database: {query}")
    results = client_db.get_docs(query=query)
    return results

# No placeholders are interpolated, so a plain string is enough here.
question = """Search for the content about where food comes from in the vector database.
Get me three results from the vector database and include the documents and distances."""
conversation = Conversation()
conversation.add_message(
    "user",
    question,
)
chat_response = chat_complete_execute_fn(conversation, tool_registry=db_registry, model="gpt-3.5-turbo")
print(chat_response)
Output:
(
    {
        'ids': [['90834f80-0432-475e-af9b-9688215db92d', 'a3c0e748-0937-46b1-a167-5aa01a70bbac', '81bed12d-1a84-4b4b-bd09-9fa964240278']],
        'distances': [[0.7584866881370544, 1.0528839826583862, 1.372355341911316]],
        'metadatas': None,
        'embeddings': None,
        'documents': [["CHAPTER.... 1 Agricultural Practices", "In order to provide food for a large population- regular production.. patterns can be identified.", "Storage\n1.3 ......Preparation of Soil"]]
    },
    {'query': {'query_texts': ['where does food come from'], 'k': 3, 'include': ['documents', 'distances']}},
    <function query_vector_db at 0x168888400>
)
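Judging from the printed value, the response appears to be a three-element tuple: the Chroma results, the arguments the model chose for the tool call, and the function that was executed. Chroma nests every field as one list per query, so the hits for a single query sit at index 0. A minimal sketch of flattening them follows; `top_hits` is a hypothetical helper, not part of this PR, and the sample values are copied from the output above.

```python
from typing import Any, Dict, List, Tuple


def top_hits(results: Dict[str, Any]) -> List[Tuple[str, float, str]]:
    """Flatten the first query's hits into (id, distance, document) tuples."""
    # Each field holds one inner list per query; index 0 is the only query here.
    ids = results["ids"][0]
    distances = results["distances"][0]
    documents = results["documents"][0]
    return list(zip(ids, distances, documents))


# Sample values copied from the output shown in this PR description.
sample_results = {
    "ids": [[
        "90834f80-0432-475e-af9b-9688215db92d",
        "a3c0e748-0937-46b1-a167-5aa01a70bbac",
        "81bed12d-1a84-4b4b-bd09-9fa964240278",
    ]],
    "distances": [[0.7584866881370544, 1.0528839826583862, 1.372355341911316]],
    "metadatas": None,
    "embeddings": None,
    "documents": [[
        "CHAPTER.... 1 Agricultural Practices",
        "In order to provide food for a large population- regular production.. patterns can be identified.",
        "Storage\n1.3 ......Preparation of Soil",
    ]],
}

hits = top_hits(sample_results)
for doc_id, distance, document in hits:
    # Show a short preview of each document next to its distance.
    print(f"{distance:.3f}  {doc_id}  {document[:40]!r}")
```

With the real response, the same helper would be applied to the first tuple element, for example after `results, tool_args, fn = chat_response` (assuming the tuple shape shown above holds).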
Document parsing is currently limited to PDFs, using either Unstructured or Azure Document Intelligence (Form Recognizer). It can be expanded as needed.
Some of the code is taken from other PRs that are currently open and does not need to be reviewed here. Please leave a comment once the earlier PRs are merged and I'll resolve the conflicts.
Files to review:
In the docs folder:
- Two notebooks: one using Unstructured and the other using Azure Document Intelligence
In the agentai folder:
- Two files: parsers.py and vectordb.py