byaldi

Error: "Document ID 0 with page ID 1 already exists in the index"

Open • Leflak opened this issue 1 year ago • 1 comment

Hi, the error "Document ID 0 with page ID 1 already exists in the index" happens when I create an index with the same files as a previous one, even with overwrite=True.

DeepSeek helped me and suggested adding the following lines:

    self.embed_id_to_doc_id = {}
    self.indexed_embeddings = []
    self.doc_ids_to_file_names = {}
    self.doc_id_to_metadata = {}
    self.highest_doc_id = -1

right after line 317 of colpali.py, which is "shutil.rmtree(index_path)". This seems to really clear the existing index in memory, rather than only deleting its folder on disk.
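To illustrate why clearing those attributes might matter, here is a minimal sketch. It does not use byaldi's actual code: `ToyIndex`, `add_page`, `build`, and the `reset_memory` flag are hypothetical names made up for this example, and only the five attribute names and the error message come from this issue. It assumes the duplicate check runs against the in-memory mappings, so removing the folder alone would leave stale entries behind.

```python
import shutil
import tempfile
from pathlib import Path


class ToyIndex:
    """Hypothetical stand-in for an index object; NOT byaldi's real class."""

    def __init__(self, index_path):
        self.index_path = Path(index_path)
        self._reset_state()

    def _reset_state(self):
        # The five attributes named in the issue, restored to empty values.
        self.embed_id_to_doc_id = {}
        self.indexed_embeddings = []
        self.doc_ids_to_file_names = {}
        self.doc_id_to_metadata = {}
        self.highest_doc_id = -1

    def add_page(self, doc_id, page_id, embedding, file_name):
        # Assumed duplicate check: looks only at in-memory state, not at disk.
        key = {"doc_id": doc_id, "page_id": page_id}
        if key in self.embed_id_to_doc_id.values():
            raise ValueError(
                f"Document ID {doc_id} with page ID {page_id} "
                "already exists in the index"
            )
        embed_id = len(self.indexed_embeddings)
        self.indexed_embeddings.append(embedding)
        self.embed_id_to_doc_id[embed_id] = key
        self.doc_ids_to_file_names[doc_id] = file_name
        self.highest_doc_id = max(self.highest_doc_id, doc_id)

    def build(self, pages, overwrite=False, reset_memory=True):
        if self.index_path.exists() and overwrite:
            shutil.rmtree(self.index_path)  # removes the folder on disk only
            if reset_memory:
                self._reset_state()  # the extra step proposed in the issue
        self.index_path.mkdir(parents=True, exist_ok=True)
        for doc_id, page_id, embedding, file_name in pages:
            self.add_page(doc_id, page_id, embedding, file_name)


def demo():
    pages = [(0, 1, [0.1, 0.2], "a.pdf")]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "index"

        # Without the in-memory reset, re-indexing the same file fails.
        buggy = ToyIndex(path)
        buggy.build(pages, overwrite=True, reset_memory=False)
        try:
            buggy.build(pages, overwrite=True, reset_memory=False)
            buggy_error = None
        except ValueError as exc:
            buggy_error = str(exc)

        # With the reset, the same sequence succeeds.
        fixed = ToyIndex(path)
        fixed.build(pages, overwrite=True)
        fixed.build(pages, overwrite=True)
        return buggy_error, len(fixed.indexed_embeddings)


if __name__ == "__main__":
    print(demo())
```

In this toy version, the second `build` call without the reset raises the same message as in the title, while the version that resets its state ends up with a single indexed page. Whether byaldi's real code path behaves this way is an assumption based on the description above.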

Sorry if this is not the proper way to raise this; I am not a dev and do not understand anything about GitHub.

Leflak, Sep 30 '24 19:09

@Leflak Awesome solution.

@bclavie, can you please resolve this issue in the next version?

behroozazarkhalili, Jan 31 '25 23:01