R2R
Issue when ingesting a file and adding it to a collection right after
Describe the bug
My code looks like this:
ingest_response = self.r2r.ingest_files(
    file_paths=[document.file_path],
    metadatas=[metadata]
)
document_id = ingest_response['results'][0]['document_id']
try:
    self.r2r.assign_document_to_collection(document_id, self.r2r.collection_id)
except Exception as e:
    logger.error(f"Error assigning document to collection: {str(e)}")
Oftentimes, I'll get an error when calling assign_document_to_collection because it cannot find the document in the database.
Looking at the code, the row in document_info is not created before the result of ingest_files is returned. Instead, it is created later, inside the ingest-files workflow:
https://github.com/SciPhi-AI/R2R/blob/c9be2c53728f4f1e4e300fde85777c38570f3141/py/core/main/api/ingestion_router.py#L164-L174
So when assign_document_to_collection is called right after ingest_files returns, the document_info record may not exist yet, and the lookup here fails:
https://github.com/SciPhi-AI/R2R/blob/c9be2c53728f4f1e4e300fde85777c38570f3141/py/core/providers/database/collection.py#L451-L462
How can this be solved? Would it be possible to also pass the collection_id when calling ingest_files? I noticed that the workflow already adds the file to the default collection.
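In the meantime, a client-side stopgap might be to retry the assignment with a short backoff until the workflow has created the document_info row. The sketch below is only an illustration of that idea: `call_with_retry` and its parameters are hypothetical names of my own, not part of R2R, and it assumes the client raises an exception while the document is still missing.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 5,
    initial_delay: float = 0.5,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds or the attempts are used up.

    Hypothetical helper, not an R2R API. It waits initial_delay seconds
    after the first failure and multiplies the wait by backoff each time.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff
    # Only reached when attempts < 1, so the loop body never ran.
    raise ValueError("attempts must be at least 1")


# Assumed usage with the snippet from this issue:
# call_with_retry(
#     lambda: self.r2r.assign_document_to_collection(
#         document_id, self.r2r.collection_id
#     )
# )
```

This only hides the race on the client side. It also retries on every exception, including ones unrelated to the missing row, so passing the collection_id at ingestion time would still be the cleaner fix.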