haystack-core-integrations
PgvectorDocumentStore - an error occurs when running the sample code from the official documentation
I am running the sample code of PgvectorDocumentStore, as follows:
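It is essentially the following (I am pasting it here with placeholder values for the table name, embedding dimension, and sample documents; PgvectorDocumentStore reads the connection string from the PG_CONN_STR environment variable):

from haystack import Document
from haystack_integrations.document_stores.pgvector import PgvectorDocumentStore

# PG_CONN_STR must point at a Postgres instance with the pgvector extension,
# e.g. postgresql://user:password@localhost:5432/postgres
document_store = PgvectorDocumentStore(
    table_name="haystack_docs",
    embedding_dimension=768,
    vector_function="cosine_similarity",
    recreate_table=True,
    search_strategy="hnsw",
)

document_store.write_documents([
    Document(content="This is first", embedding=[0.1] * 768),
    Document(content="This is second", embedding=[0.3] * 768),
])
print(document_store.count_documents())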
The above corresponds to the example in the PgvectorDocumentStore section of the documentation. I executed it unchanged and ran into the following error:
File "F:\AWorks\work_python\ragkb\test\PythonTest.py", line 407, in
By tracing through the source code, I found that when a Document is converted into a dictionary, the old dataframe field is still expected. However, I am using version 2.13.1, where the dataframe field is no longer included when a Document is converted into a dictionary. How can I solve this problem?
Using haystack-ai 2.13.1.
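For what it's worth, a quick check on my environment (illustrative snippet, not from the docs):

from haystack import Document

doc = Document(content="hello")
# On haystack-ai 2.13.x the serialized Document no longer contains a "dataframe" key
print("dataframe" in doc.to_dict())  # prints False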
Hello!
Upgrading the pgvector integration should fix this:
pip install -U pgvector-haystack
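To double-check what you end up with after the upgrade (just an illustrative check):

from importlib.metadata import version

print(version("pgvector-haystack"))
print(version("haystack-ai"))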
thank you
solved?
With version 2.6.1, I could subclass PgvectorDocumentStore and override its _from_haystack_to_pg_documents method to add my custom database fields to db_documents. After updating to version 2.13.1, I found that _from_haystack_to_pg_documents is no longer a method of PgvectorDocumentStore. I re-read the documentation and still don't know how to solve this.
Ah, this seems like a different problem and you probably need a monkeypatch to do that.
Something like:
from typing import Any, Dict, List
from haystack import Document
import haystack_integrations.document_stores.pgvector.document_store as pg_document_store

def mycustom_from_haystack_to_pg_documents(documents: List[Document]) -> List[Dict[str, Any]]:
    ...

pg_document_store._from_haystack_to_pg_documents = mycustom_from_haystack_to_pg_documents
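Note that the patch has to run before write_documents is called, and it needs to target the module where the store actually resolves the function at call time; in recent versions that should be pgvector's document_store module (which imports the helper from converters.py), but double-check against the version you have installed.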
In any case, since this method is internal and an implementation detail, there is no guarantee that it will be there in the future... Out of curiosity, what are you trying to achieve?
This is because I ran into some problems when using PgvectorDocumentStore to write documents into the vector database. I initialized the store like this:

document_store = CustomPgvectorDocumentStore(
    table_name="my_table",
    embedding_dimension=1024,
    vector_function="cosine_similarity",
    recreate_table=False,
    search_strategy="hnsw",
)

I then want to use document_store.write_documents to write documents into my_table, but my_table has some columns that Haystack does not know about, such as docId and docType. I put these custom fields into the Document's meta, and the code then fails with:

Traceback (most recent call last):
  File "F:\AWorks\work_python\ragkb.venv\lib\site-packages\haystack\core\pipeline\pipeline.py", line 60, in _run_component
    component_output = instance.run(**inputs)
  File "F:\AWorks\work_python\ragkb.venv\lib\site-packages\haystack\components\writers\document_writer.py", line 100, in run
    documents_written = self.document_store.write_documents(documents=documents, policy=policy)
  File "F:\AWorks\work_python\ragkb.venv\lib\site-packages\haystack_integrations\document_stores\pgvector\document_store.py", line 805, in write_documents
    raise DocumentStoreError(error_msg) from e
haystack.document_stores.errors.errors.DocumentStoreError: Could not write documents to PgvectorDocumentStore. Error: ProgrammingError('query parameter missing: dataframe, docId, docType'). You can find the SQL query and the parameters in the debug logs.
Looking at the source code, I can see that the db_documents passed to self._cursor.executemany(sql_insert, db_documents, returning=True) does not contain custom fields such as docId and docType. That is why I overrode the _from_haystack_to_pg_documents method in version 2.6.1: it let the db_documents I produce include the custom columns of my_table.
In version 2.13.1, _from_haystack_to_pg_documents no longer belongs to PgvectorDocumentStore; it now lives in converters.py.
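For example, a quick check (the values here are just from my local experiment):

from haystack import Document
from haystack_integrations.document_stores.pgvector.converters import _from_haystack_to_pg_documents

docs = [Document(content="hello", meta={"docId": "D-1", "docType": "pdf"})]
db_docs = _from_haystack_to_pg_documents(docs)
# meta is serialized into a single column; there are no top-level docId / docType
# keys, so executemany has nothing to bind for those parameters
print(db_docs[0].keys())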
My current workaround is to inherit from PgvectorDocumentStore, override write_documents, and replace _from_haystack_to_pg_documents.
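Roughly like this (a minimal sketch of the workaround; the docId / docType handling is specific to my_table, and it assumes write_documents resolves the helper through the document_store module at call time):

from typing import Any, Dict, List

from haystack import Document
from haystack.document_stores.types import DuplicatePolicy
from haystack_integrations.document_stores.pgvector import PgvectorDocumentStore
import haystack_integrations.document_stores.pgvector.document_store as pg_document_store


class CustomPgvectorDocumentStore(PgvectorDocumentStore):
    @staticmethod
    def _to_pg_documents(documents: List[Document]) -> List[Dict[str, Any]]:
        # Start from the stock conversion, then promote my custom meta keys to columns
        db_documents = pg_document_store._from_haystack_to_pg_documents(documents)
        for doc, db_doc in zip(documents, db_documents):
            db_doc["docId"] = doc.meta.get("docId")
            db_doc["docType"] = doc.meta.get("docType")
        return db_documents

    def write_documents(self, documents: List[Document], policy: DuplicatePolicy = DuplicatePolicy.NONE) -> int:
        # Temporarily swap in the custom converter, then delegate to the parent implementation
        original = pg_document_store._from_haystack_to_pg_documents
        pg_document_store._from_haystack_to_pg_documents = self._to_pg_documents
        try:
            return super().write_documents(documents, policy=policy)
        finally:
            pg_document_store._from_haystack_to_pg_documents = original

The extra keys only help because the insert statement in my setup has matching placeholders for those columns.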