haystack-core-integrations
Pinecone: support hybrid retrieval
Supporting hybrid (sparse+dense embedding) retrieval should be quite simple, now that all abstractions and components are in place.
Sparse embedding retrieval in isolation should be investigated: Pinecone does not provide this feature out of the box.
@anakin87 Any update on this? Pinecone has supported sparse embedding retrieval for a bit.
In Pinecone, the "recommended" method is to keep dense and sparse embeddings in entirely separate indexes and query both of them for hybrid retrieval (this is the method I am using). Alternatively, you can store dense and sparse embeddings in the same index and query that single index. Source: https://docs.pinecone.io/guides/search/hybrid-search
I think PineconeDocumentStore / PineconeEmbeddingRetriever should reflect this and offer the option to query an index with dense embeddings, sparse embeddings, or both.
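To illustrate the separate-index approach, here is a minimal sketch of how the two result lists could be combined on the client side. It uses reciprocal rank fusion, which is my own choice for the example and not something taken from Pinecone's guide or from Haystack. The function name `reciprocal_rank_fusion`, the constant `k=60`, and the document ids are all hypothetical, and the two input lists stand in for ids that would come back from a dense-index query and a sparse-index query.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists, k=60, top_k=10):
    """Fuse several ranked lists of document ids into one ranking.

    Each id receives 1 / (k + rank) from every list it appears in,
    so ids ranked highly by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ids in result_lists:
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked[:top_k]]


# Hypothetical ids, standing in for the results of the two index queries.
dense_ids = ["doc-a", "doc-b", "doc-c"]
sparse_ids = ["doc-b", "doc-d", "doc-a"]

fused = reciprocal_rank_fusion([dense_ids, sparse_ids], top_k=3)
print(fused)
```

Rank-based fusion avoids comparing raw dense and sparse scores directly, since they are on different scales. A reranking step over the merged candidates would be another option.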
@anakin87
https://gist.github.com/EndreSzakal/96c0a4ada0b4558ffca52095968e7e1e
I created custom components that extend the existing Pinecone document store and retriever, plus a new Pinecone sparse embedder component, to enable sparse embedding support in my own project. I wrote them to resemble the current Haystack integrations, so hopefully they will be useful for an official release.