feat: FAISS in OpenSearch: Support HNSW for cosine

Open tstadel opened this issue 2 years ago • 1 comment

Related Issues

  • closes #2913

Proposed Changes:

  • normalize vectors manually at index time
  • normalize vectors manually at query time
  • use dot_product space under the hood
  • make normalization more efficient by normalizing a two-dimensional numpy array in one step instead of normalizing each vector separately
  • run cosine tests for all document stores
  • run all document_store integration tests (not only OpenSearch)
  • fix test_faiss_and_milvus.py tests not running
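
The normalization approach described above can be sketched roughly as follows. This is an illustrative example, not the code from this PR: the helper names normalize_embeddings and cosine_via_dot_product are hypothetical, and plain numpy stands in for the OpenSearch index. It shows why a dot_product space returns cosine similarity once both the indexed vectors and the query vector have unit length.

```python
import numpy as np


def normalize_embeddings(embeddings: np.ndarray) -> np.ndarray:
    """L2-normalize every row of a 2-D array in one vectorized step.

    Hypothetical helper for illustration; not the Haystack implementation.
    """
    embeddings = np.asarray(embeddings, dtype=np.float32)
    if embeddings.ndim == 1:
        # Treat a single vector as a batch of one.
        embeddings = embeddings.reshape(1, -1)
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Leave all-zero rows untouched instead of dividing by zero.
    norms[norms == 0] = 1.0
    return embeddings / norms


def cosine_via_dot_product(query: np.ndarray, docs: np.ndarray) -> np.ndarray:
    """Score docs against a query using only dot products of unit vectors."""
    q = normalize_embeddings(query)  # query-time normalization
    d = normalize_embeddings(docs)   # index-time normalization
    return (d @ q.T).ravel()


docs = np.array([[3.0, 4.0], [0.0, 2.0], [0.0, 0.0]])
query = np.array([1.0, 0.0])
scores = cosine_via_dot_product(query, docs)
```

Normalizing the whole array at once avoids a Python-level loop over vectors, which is the efficiency gain the list above refers to. Zero vectors are passed through unchanged here; that is a choice made for this sketch only.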

pushed to another PR:

  • streamline validation of embedding shape across all document stores (BaseDocumentStore_validate_embeddings_shape)
  • introduce DenseRetriever abstraction to facilitate usage of retrievers in document store's update_embeddings

How did you test it?

  • made the test_cosine_similarity test in test_faiss_and_milvus.py generic so that it runs on all document stores

Notes for the reviewer

Checklist

tstadel, Sep 14 '22 16:09

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.