Error when calling delete_documents on a new PineconeDocumentStore

Open jamescalam opened this issue 3 years ago • 1 comments

Describe the bug

After initializing a PineconeDocumentStore, calling delete_documents raises a KeyError, because the method looks up the index name in a local ID dictionary (self.all_ids) that has no entry for it yet.

Error message

KeyError                                  Traceback (most recent call last)
<ipython-input-5-825710cfbb3a> in <module>
      1 document_store.delete_all_documents()
      2 
----> 3 document_store.delete_documents()

/usr/local/lib/python3.7/dist-packages/haystack/document_stores/pinecone.py in delete_documents(self, index, ids, filters, headers, drop_ids, namespace)
    960             # If no filters or IDs we delete everything
    961             self.pinecone_indexes[index].delete(delete_all=True, namespace=namespace)
--> 962             id_values = list(self.all_ids[index])
    963         else:
    964             if ids is None:

KeyError: 'haystack'

Expected behavior

Nothing should happen, or perhaps a warning that no index exists. Otherwise the call should work normally.
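
A minimal sketch of the "warn instead of raise" behavior described here, in plain Python with no Pinecone connection. The function name `ids_to_delete` and its arguments are hypothetical and are not part of Haystack; the dictionary only imitates the `all_ids` lookup seen in the traceback.

```python
import logging
from typing import Dict, List, Set

logger = logging.getLogger(__name__)


def ids_to_delete(all_ids: Dict[str, Set[str]], index: str) -> List[str]:
    """Return the cached IDs for `index`, warning instead of raising if none exist.

    `all_ids` is a hypothetical stand-in for the store's local ID cache.
    """
    if index not in all_ids:
        # A fresh store has no entry for the index yet, so there is nothing to delete.
        logger.warning("No documents cached for index '%s'; nothing to delete.", index)
        return []
    # Sorted only to make the result deterministic.
    return sorted(all_ids[index])
```

With an empty cache the call returns an empty list and logs a warning, rather than raising the KeyError shown above.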

To Reproduce

from haystack.document_stores.pinecone import PineconeDocumentStore

ENV = 'us-west1-gcp'
KEY = '<<API_KEY>>'

document_store = PineconeDocumentStore(
    api_key=KEY, environment=ENV,
    index="haystack",
    similarity="dot_product"
)

document_store.delete_all_documents()
document_store.delete_documents()

FAQ Check

System:

  • OS: Colab
  • Haystack version (commit or version number): From main
  • DocumentStore: Pinecone

Aug 26 '22 09:08 jamescalam

I will submit a fix soon

Aug 26 '22 09:08 jamescalam

I came across this bug too. Please fix it ASAP.

Jan 24 '23 10:01 muazhari

I think the following check could also be added to delete_documents in pinecone.py to prevent the error:

        if index not in self.all_ids:
            self.all_ids[index] = set()

This way, there is no need to warn the user that the local (temporary) dictionary lacks the index they just passed to the PineconeDocumentStore constructor, which in turn creates it. From my perspective, such a warning would be confusing for exactly that reason. All delete_documents code paths also seem to be unaffected by this change. I might be missing something; any thoughts about this?
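
To illustrate the idea without a Pinecone connection, here is a rough sketch. `FakeStore` and its `write` method are hypothetical stand-ins that only imitate the `all_ids` bookkeeping; this is not the actual Haystack code, and the remote delete call is left out.

```python
from typing import Dict, Iterable, List, Optional, Set


class FakeStore:
    """Hypothetical stand-in for the local ID bookkeeping of a document store."""

    def __init__(self, index: str) -> None:
        self.index = index
        # Local cache of document IDs per index; empty until something is written.
        self.all_ids: Dict[str, Set[str]] = {}

    def write(self, ids: Iterable[str], index: Optional[str] = None) -> None:
        index = index or self.index
        self.all_ids.setdefault(index, set()).update(ids)

    def delete_documents(self, index: Optional[str] = None) -> List[str]:
        index = index or self.index
        # Proposed guard: make sure the index has an entry before reading it.
        if index not in self.all_ids:
            self.all_ids[index] = set()
        # Without the guard, this lookup raises KeyError on a fresh store.
        id_values = sorted(self.all_ids[index])
        self.all_ids[index].clear()
        return id_values


store = FakeStore(index="haystack")
deleted_on_fresh_store = store.delete_documents()  # empty list, no KeyError
```

In this sketch, deleting from a fresh store simply returns an empty list, and deleting after a write returns the cached IDs as before.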

Apr 04 '23 10:04 Namoush