feat: Support embedding dimensions on DeepsetCloudDocumentStore

Open dmigo opened this issue 2 years ago • 4 comments

Related Issues

  • fixes #issue-number

Proposed Changes:

  • Add embedding_dim to dc store
  • Remove similarity and return_embedding from query params, as they are not used

How did you test it?

Notes for the reviewer

Checklist

dmigo avatar Aug 08 '22 14:08 dmigo

Removing return_embedding from get_document is fine for now. In the future we might support that in all document stores: https://github.com/deepset-ai/haystack/issues/3007. Currently it's just confusing as it has no effect.

tstadel avatar Aug 09 '22 13:08 tstadel

Sorry for the intrusion. I ran into the same problem with schema generation in the past, so here is some information that may help:

  • the CI shows an error, requesting to locally update and commit the JSON schema
  • you try to generate the schema but the right schema is generated only if:
    • you have a full installation of Haystack (pip install -e '.[all]')
    • every module is working and importable (update_json_schema in generate_json_schema.py tries to import all the possible nodes at some point, and I found out that in my installation the audio nodes were not working)
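
A minimal sketch for checking the second condition before regenerating the schema. It is not part of Haystack: find_broken_modules is a hypothetical helper, and the module names you pass to it are your own guesses at what the schema generator needs to import.

```python
import importlib


def find_broken_modules(module_names):
    """Try to import each module and return {name: error} for the ones that fail."""
    broken = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:
            # Optional dependencies can fail with more than ImportError
            # (e.g. OSError from a missing shared library), so catch broadly.
            broken[name] = f"{type(exc).__name__}: {exc}"
    return broken
```

For example, calling find_broken_modules(["haystack.nodes"]) in the environment where the schema is generated should return an empty dict if that package imports cleanly (the package name here is an assumption; substitute whichever modules you suspect).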

anakin87 avatar Aug 09 '22 14:08 anakin87

@anakin87 Thanks for the hints! pip install -e '.[all]' is what I'm trying to run at the moment, but some packages fail to install, probably because of the M1 chip. I'm debugging this right now.

dmigo avatar Aug 09 '22 14:08 dmigo

To generate a valid schema, I did the following:

  1. brew install openblas
  2. brew upgrade cmake
  3. Excluded onnx from the list of dependencies
  4. GRPC_PYTHON_BUILD_SYSTEM_ZLIB=true OPENBLAS="$(brew --prefix openblas)" pip install -e '.[all]'
  5. python .github/utils/generate_json_schema.py

dmigo avatar Aug 09 '22 14:08 dmigo