Issue: Weaviate: why similarity_search uses with_near_text?

Open hsm207 opened this issue 2 years ago • 5 comments

Issue you'd like to raise.

similarity_search in weaviate uses near_text.

This requires weaviate to be set up with a text2vec module.

At the same time, the Weaviate class also takes an embedding model as one of its init parameters.

Why don't we use the embedding model to vectorize the search query and then use weaviate's near_vector operator to do the search?

Suggestion:

If a user is using langchain with weaviate, we can assume that they want to use langchain's features to generate the embeddings and, as such, will not have any text2vec module enabled.

May 15 '23 18:05 hsm207

what's the concrete suggestion here (like the code change)?

May 15 '23 20:05 dev2049

if the team agrees with the suggestion, then the code change is to rewrite similarity_search so that it embeds the query using the embedding model the class was initialised with and then passes it to similarity_search_by_vector

May 15 '23 20:05 hsm207
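
For illustration, the proposed change could look roughly like the sketch below. This is not the actual langchain or Weaviate code: SketchVectorStore, toy_embed, and cosine are hypothetical stand-ins (an in-memory list instead of a Weaviate index, a letter-count vector instead of a real embedding model). Only the method names similarity_search and similarity_search_by_vector come from the thread.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in for an embedding model: a letter-count vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SketchVectorStore:
    """Hypothetical in-memory store, used only to show the delegation."""

    def __init__(self, embed_query: Callable[[str], Vector]) -> None:
        self._embed_query = embed_query
        self._rows: List[Tuple[str, Vector]] = []

    def add_texts(self, texts: List[str]) -> None:
        # Vectors are produced client-side, so no text2vec module is assumed.
        for text in texts:
            self._rows.append((text, self._embed_query(text)))

    def similarity_search_by_vector(self, vector: Vector, k: int = 4) -> List[str]:
        # Stand-in for a near_vector query: rank stored rows by similarity.
        ranked = sorted(self._rows, key=lambda row: cosine(vector, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

    def similarity_search(self, query: str, k: int = 4) -> List[str]:
        # Embed the query with the model the store was initialised with,
        # then delegate to the vector-based search.
        return self.similarity_search_by_vector(self._embed_query(query), k=k)
```

With this shape, the text2vec module is never involved at query time: the same embedding function is used for both the stored texts and the query.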

i'm for it. feel free to open a PR, otherwise I can get to it in a few hours

May 15 '23 20:05 dev2049

can we have a way to toggle between the two? eg keep both functionality and let the user decide?

May 15 '23 20:05 hwchase17

> can we have a way to toggle between the two? eg keep both functionality and let the user decide?

@hwchase17 sure. PR #4365 supports this requirement. We can close this issue once that PR is merged

May 16 '23 12:05 hsm207
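
One way to keep both behaviours and let the user decide is a constructor flag, sketched below. The thread does not say how the PR exposes the toggle, so everything here is an assumption: ToggleSearch, the by_text flag, and the near_text_fn / near_vector_fn callables (which stand in for Weaviate's near_text and near_vector operators) are hypothetical names for illustration only.

```python
from typing import Callable, List, Optional

Vector = List[float]


class ToggleSearch:
    """Hypothetical wrapper that lets the caller pick the search path."""

    def __init__(
        self,
        near_text_fn: Callable[[str, int], List[str]],
        near_vector_fn: Callable[[Vector, int], List[str]],
        embed_query: Optional[Callable[[str], Vector]] = None,
        by_text: bool = True,  # hypothetical flag name, not confirmed by the thread
    ) -> None:
        self._near_text_fn = near_text_fn
        self._near_vector_fn = near_vector_fn
        self._embed_query = embed_query
        self._by_text = by_text

    def similarity_search(self, query: str, k: int = 4) -> List[str]:
        if self._by_text:
            # Server-side vectorization: requires a text2vec module.
            return self._near_text_fn(query, k)
        if self._embed_query is None:
            raise ValueError("by_text=False requires an embedding function")
        # Client-side vectorization: embed locally, then search by vector.
        return self._near_vector_fn(self._embed_query(query), k)
```

Defaulting the flag to the text-based path would keep existing setups working unchanged, while users without a text2vec module could opt in to the vector path.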
