
Support `embedders` setting and other vector/hybrid search related configuration

Open CommanderStorm opened this issue 1 year ago • 19 comments

Pull Request

Related issue

  • Fixes https://github.com/meilisearch/meilisearch-rust/issues/541
  • Fixes https://github.com/meilisearch/meilisearch-rust/issues/612
  • Fixes https://github.com/meilisearch/meilisearch-rust/issues/621
  • Fixes https://github.com/meilisearch/meilisearch-rust/issues/646

What does this PR do?

  • Adds the required settings

    • with_embedders uses the same "API" as with_synonyms (it does not use impl AsRef for the items passed), as this is the closest existing equivalent
    • set_embedders has not been implemented upstream (at least, when I try to PATCH the object, it does not work), so only the {get,reset}_embedders settings have been implemented. This implementation follows the work done in https://github.com/meilisearch/meilisearch-python/pull/924
  • Adds the hybrid field to search, so that vector search can serve as an end-to-end test of this feature with the huggingface configuration.

    userProvided seems more brittle, but we may want to change to it instead. Using userProvided would mean (at the cost of hardcoding stuff) lower CPU effort, no higher timeout necessary, and alignment with meilisearch/meilisearch in only having a test for userProvided.
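
To illustrate the API shape described above, here is a minimal, std-only sketch. `SettingsSketch` and `EmbedderSketch` are hypothetical stand-ins invented for this example, not the crate's real types, and the variant fields are assumptions. The point is only that with_embedders, like with_synonyms, takes an owned map rather than `impl AsRef` items.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an embedder configuration.
/// This is NOT the crate's real type; the fields are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum EmbedderSketch {
    /// Vectors are supplied by the caller; only the dimension count is configured.
    UserProvided { dimensions: usize },
    /// Vectors are computed from a named HuggingFace model.
    HuggingFace { model: String },
}

/// Hypothetical stand-in for the settings builder.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SettingsSketch {
    pub synonyms: Option<HashMap<String, Vec<String>>>,
    pub embedders: Option<HashMap<String, EmbedderSketch>>,
}

impl SettingsSketch {
    pub fn new() -> Self {
        Self::default()
    }

    /// Takes an owned map of concrete types, without `impl AsRef<str>` generics.
    pub fn with_synonyms(mut self, synonyms: HashMap<String, Vec<String>>) -> Self {
        self.synonyms = Some(synonyms);
        self
    }

    /// Mirrors `with_synonyms`: an owned map keyed by embedder name.
    pub fn with_embedders(mut self, embedders: HashMap<String, EmbedderSketch>) -> Self {
        self.embedders = Some(embedders);
        self
    }
}
```

A caller would build a `HashMap` with one entry, for example `"default"` mapped to `EmbedderSketch::UserProvided { dimensions: 3 }`, and pass it to `SettingsSketch::new().with_embedders(...)`. Because the map is owned, keys have to be converted with `to_string()` up front, which is the same trade-off with_synonyms makes.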

TODO:

  • [x] find a combination of semantic search model + configuration that does not fail the assumptions (see search testcase) spectacularly

PR checklist

Please check if your PR fulfills the following requirements:

  • [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
  • [x] Have you read the contributing guidelines?
  • [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!

Summary by CodeRabbit

  • New Features

    • Added support for hybrid semantic search, allowing users to combine keyword and semantic search with customizable parameters.
    • Introduced the ability to provide custom embedding vectors in search queries and retrieve vectors in search results.
    • Added comprehensive configuration options for semantic search embedders, supporting multiple providers (HuggingFace, OpenAI, Ollama, REST, and user-provided).
    • Enabled management of embedders through new settings and API methods, including fetching, setting, and resetting embedder configurations.
  • Documentation

    • Added detailed usage examples and documentation for new semantic search and embedder configuration features.
  • Tests

    • Introduced new tests to verify hybrid search, vector retrieval, and embedder management functionalities.
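
For readers unfamiliar with the request shape behind the hybrid search feature, the following std-only sketch builds a search body by hand. `hybrid_search_body` is a hypothetical helper, not part of this crate, and the field names (`q`, `hybrid`, `embedder`, `semanticRatio`, `vector`, `retrieveVectors`) are assumed to match the Meilisearch search parameters. The real client serializes a typed query instead of formatting strings.

```rust
/// Hypothetical helper: formats a hybrid search request body as JSON text.
/// Assumption: field names follow the Meilisearch search parameters.
/// Note: `query` and `embedder` are NOT JSON-escaped; this is a sketch only.
pub fn hybrid_search_body(
    query: &str,
    embedder: &str,
    semantic_ratio: f32,
    vector: Option<&[f32]>,
    retrieve_vectors: bool,
) -> String {
    // Keyword query plus the hybrid block that selects the embedder and
    // the keyword/semantic balance.
    let mut body = format!(
        "{{\"q\":\"{}\",\"hybrid\":{{\"embedder\":\"{}\",\"semanticRatio\":{}}}",
        query, embedder, semantic_ratio
    );
    // Optional caller-supplied embedding (needed for a userProvided embedder).
    if let Some(values) = vector {
        let items: Vec<String> = values.iter().map(|x| x.to_string()).collect();
        body.push_str(&format!(",\"vector\":[{}]", items.join(",")));
    }
    // Ask for stored vectors to be returned with the hits.
    body.push_str(&format!(",\"retrieveVectors\":{}}}", retrieve_vectors));
    body
}
```

With a userProvided embedder the `vector` argument carries the hardcoded embedding, which is what keeps such a test cheap compared with computing embeddings from a HuggingFace model.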

CommanderStorm, Mar 02 '24 15:03