Issue when ingesting a file and adding it to a collection right after

Open jeremi opened this issue 1 year ago • 2 comments

Describe the bug

My code is like this:

    ingest_response = self.r2r.ingest_files(
        file_paths=[document.file_path],
        metadatas=[metadata]
    )
    document_id = ingest_response['results'][0]['document_id']
    try:
        self.r2r.assign_document_to_collection(document_id, self.r2r.collection_id)
    except Exception as e:
        logger.error(f"Error assigning document to collection: {str(e)}")

Oftentimes, I'll get an error when calling assign_document_to_collection because it cannot find the document in the database.

Looking at the code, the row in document_info is not created before ingest_files returns its result. It is created later, in the ingest-files workflow: https://github.com/SciPhi-AI/R2R/blob/c9be2c53728f4f1e4e300fde85777c38570f3141/py/core/main/api/ingestion_router.py#L164-L174

So when assign_document_to_collection is called, the document_info record may not exist yet: https://github.com/SciPhi-AI/R2R/blob/c9be2c53728f4f1e4e300fde85777c38570f3141/py/core/providers/database/collection.py#L451-L462

How can this be solved? Would it be possible to also pass the collection_id when calling ingest_files? I noticed that this workflow already adds the file to the default collection.
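
Until something like that exists, one possible client-side workaround is to retry the assignment for a short while, so the ingest-files workflow has time to create the document_info row. This is only a minimal sketch: retry_call is a hypothetical helper (not part of R2R), the attempt count and delays are arbitrary, and catching every Exception is an assumption, since I have not checked which exception type the client raises when the document is missing.

```python
import time


def retry_call(func, *args, attempts=5, initial_delay=0.5, backoff=2.0,
               sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying with exponential backoff.

    Hypothetical helper, not part of R2R. Any exception triggers a retry;
    the last one is re-raised once all attempts are used up.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the last error to the caller
            sleep(delay)      # wait before the next attempt
            delay *= backoff  # grow the wait each time


# Intended use with the snippet above (assumes the same self.r2r client):
#   retry_call(self.r2r.assign_document_to_collection,
#              document_id, self.r2r.collection_id)
```

This only hides the race rather than fixing it, so being able to pass the collection_id at ingestion time would still be the cleaner solution.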

jeremi avatar Oct 20 '24 14:10 jeremi