
FAISS Index-Docstore Inconsistency Issue

Open shenshiqiSSQ opened this issue 5 months ago • 2 comments

🐛 Describe the bug

When I checked the FAISS vector store integration, I found that deleting and then adding vectors causes the stored index mapping to become misaligned, so some data is missed during search.

Normally, the `docstore` and `index_to_id` read from the `.pkl` file should have the same number of entries. I found that they do not, which might be caused by updates after deletions.
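A minimal sketch of how the mismatch can be inspected, assuming both objects are plain dicts once loaded (`docstore` keyed by vector id, `index_to_id` mapping a position to a vector id). The helper name `diff_mapping` is hypothetical and not part of mem0:

```python
def diff_mapping(docstore, index_to_id):
    """Report where a docstore dict and an index_to_id dict disagree.

    Assumption: docstore maps vector id -> payload and
    index_to_id maps integer position -> vector id.
    """
    doc_ids = set(docstore)
    mapped_ids = set(index_to_id.values())
    return {
        # Entry counts on each side; equal counts do not prove alignment.
        "sizes": (len(docstore), len(index_to_id)),
        # Documents that no position points to (never returned by search).
        "in_docstore_only": sorted(doc_ids - mapped_ids),
        # Positions that point to a document that no longer exists.
        "in_index_only": sorted(mapped_ids - doc_ids),
    }
```

Comparing the id sets rather than only the lengths matters because the two containers can have the same size and still refer to different ids.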

shenshiqiSSQ · Jul 30 '25 09:07

ouch yeah this one hurts — classic mismatch between index_to_id and docstore.

FAISS itself doesn't manage metadata integrity, so if you're doing delete + re-add + save/load, you're likely to desync the index and the docstore. the index_to_id mapping no longer aligns with what's in the docstore dict, and your queries start pulling ghosts or missing valid docs.

if you want to patch around it short-term:

  • always rebuild the index and docstore together after major deletions
  • or manually re-sync index_to_id with docstore.keys() before saving
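The second option can be sketched roughly as below. This is an illustration under assumptions, not mem0's code: `resync_index_to_id` is a hypothetical helper, `docstore` is assumed to map vector id to payload, and `index_to_id` is assumed to map position to vector id. Positions are deliberately left unchanged so they keep pointing at the same rows of the FAISS index.

```python
def resync_index_to_id(docstore, index_to_id):
    """Drop mapping entries whose document is gone and report orphaned documents.

    Returns (kept, orphans):
      kept    - position -> vector id, only for ids still present in docstore
      orphans - vector ids that are in docstore but have no position at all
    """
    kept = {
        pos: vid
        for pos, vid in sorted(index_to_id.items())
        if vid in docstore
    }
    orphans = sorted(set(docstore) - set(kept.values()))
    return kept, orphans
```

Orphaned ids cannot be repaired from the mapping alone, because their vectors may no longer have a known position; they would need to be re-added or removed from the docstore.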

but long term? yeah... the vectorstore abstraction should really have better integrity guarantees — or at least throw when misaligned.
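A fail-fast guard along those lines might look like the sketch below. The names `IndexMisalignedError` and `ensure_aligned` are hypothetical; `ntotal` is meant to be the vector count reported by the FAISS index (its `ntotal` attribute), passed in as a plain integer so the check itself has no FAISS dependency.

```python
class IndexMisalignedError(RuntimeError):
    """Raised when the id mapping and the docstore no longer agree."""


def ensure_aligned(ntotal, docstore, index_to_id):
    """Raise instead of letting retrieval fail silently.

    Assumption: index_to_id maps row position -> vector id and
    docstore maps vector id -> payload.
    """
    mapped_ids = set(index_to_id.values())
    if mapped_ids != set(docstore):
        raise IndexMisalignedError(
            f"id sets differ: {len(mapped_ids)} mapped vs {len(docstore)} stored"
        )
    # Every mapped position must refer to a row that exists in the index.
    out_of_range = [pos for pos in index_to_id if not 0 <= pos < ntotal]
    if out_of_range:
        raise IndexMisalignedError(
            f"positions outside the index (ntotal={ntotal}): {out_of_range}"
        )
```

Calling a check like this right before saving and right after loading would surface the problem at the point where it is introduced rather than at query time.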

you're not alone tho. many setups hit this and never realize why retrieval fails silently. this bug likes to wear an invisibility cloak.

hope this helps before you start doubting reality. ^^

also, we have a list of 16 common failures (MIT License). tell me if you need it.

onestardao · Aug 07 '25 09:08

I have also encountered this issue. You can work around it by adding a re-indexing step to the delete code.

change:

if index_to_delete is not None:
    self.docstore.pop(vector_id, None)
    self.index_to_id.pop(index_to_delete, None)

    self._save()

    logger.info(f"Deleted vector {vector_id} from collection {self.collection_name}")

to:

if index_to_delete is not None:
    self.docstore.pop(vector_id, None)
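    self.index_to_id.pop(index_to_delete, None)

    # Sanity check: every stored document should still have exactly one index entry.
    assert len(self.docstore) == len(self.index_to_id), "Faiss delete(): docstore and index_to_id sizes do not match"
    # Renumber the remaining entries to 0..n-1 so the keys stay contiguous after the pop.
    self.index_to_id = dict(enumerate(self.index_to_id.values()))
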
    self._save()

    logger.info(f"Deleted vector {vector_id} from collection {self.collection_name}")
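To see the effect of the renumbering in isolation, here is a small standalone sketch using plain dicts in place of the vector store. The function name `delete_and_compact` is hypothetical. One assumption worth stating loudly: contiguous keys only line up with FAISS row positions if the vector is also removed from the FAISS index in a way that shifts later rows down; if the vector stays in the index, the renumbered keys would point at the wrong rows.

```python
def delete_and_compact(docstore, index_to_id, vector_id):
    """Remove vector_id from both dicts, then renumber positions to 0..n-1.

    Returns True if the id was found and removed, False otherwise.
    """
    position = next(
        (pos for pos, vid in index_to_id.items() if vid == vector_id), None
    )
    if position is None:
        return False

    docstore.pop(vector_id, None)
    index_to_id.pop(position, None)

    # Without this step the keys keep a gap (e.g. 0, 2) after the pop.
    remaining = [index_to_id[pos] for pos in sorted(index_to_id)]
    index_to_id.clear()
    index_to_id.update(enumerate(remaining))
    return True
```

Sorting by position before renumbering keeps the relative order of the remaining entries even if the dict was not built in ascending key order.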

Mr-Potential · Oct 17 '25 06:10