langchain
Issue: Weaviate: why similarity_search uses with_near_text?
Issue you'd like to raise.
similarity_search in weaviate uses near_text.
This requires weaviate to be set up with a text2vec module.
At the same time, the Weaviate class also takes an embedding model as one of its init parameters.
Why don't we use the embedding model to vectorize the search query and then use weaviate's near_vector operator to do the search?
Suggestion:
If a user is using langchain with weaviate, we can assume that they want to use langchain's features to generate the embeddings and as such, will not have any text2vec module enabled.
what's the concrete suggestion here (like the code change)?
if the team agrees with the suggestion, then the code change is to rewrite similarity_search so that it embeds the query using the embedding model the class was initialised with and then passes it to similarity_search_by_vector
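A minimal sketch of that change, for illustration only. `TinyVectorStore`, `toy_embed`, and `cosine` are hypothetical stand-ins, not the LangChain `Weaviate` class or a real embedding model; the point is only that `similarity_search` embeds the query client-side and then delegates to `similarity_search_by_vector`, so no text2vec module is needed on the server.

```python
import math
from typing import Callable, List, Tuple

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def toy_embed(text: str) -> List[float]:
    # Hypothetical embedding: letter counts. A real setup would call the
    # embedding model the store was initialised with.
    return [float(text.lower().count(c)) for c in ALPHABET]


def cosine(a: List[float], b: List[float]) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


class TinyVectorStore:
    """In-memory stand-in for a vector store (assumption, not Weaviate)."""

    def __init__(self, embed_query: Callable[[str], List[float]]):
        self._embed_query = embed_query
        self._rows: List[Tuple[str, List[float]]] = []

    def add_texts(self, texts: List[str]) -> None:
        for text in texts:
            self._rows.append((text, self._embed_query(text)))

    def similarity_search_by_vector(self, vector: List[float], k: int = 4) -> List[str]:
        # Plays the role of a near_vector query: rank stored vectors by similarity.
        ranked = sorted(self._rows, key=lambda row: cosine(vector, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

    def similarity_search(self, query: str, k: int = 4) -> List[str]:
        # The suggested behaviour: embed the query locally, then search by vector.
        return self.similarity_search_by_vector(self._embed_query(query), k)


store = TinyVectorStore(toy_embed)
store.add_texts(["aaa", "bbb", "abc"])
results = store.similarity_search("aa", k=2)
```

With this toy data, the query "aa" ranks "aaa" first and "abc" second, because ranking depends only on the locally computed vectors.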
i'm for it. feel free to open a PR, otherwise I can get to it in a few hours
can we have a way to toggle between the two? eg keep both functionality and let the user decide?
@hwchase17 sure. PR #4365 supports this requirement. We can close this issue once that PR is merged
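One way the toggle requested above could look, as a sketch under stated assumptions. The `by_text` flag, the `ToggleStore` class, and the tuple return values are hypothetical and only show the dispatch; this thread does not confirm how the PR implements it.

```python
from typing import Callable, List, Optional, Tuple


class ToggleStore:
    """Hypothetical store that records which kind of query it would issue."""

    def __init__(
        self,
        embed_query: Optional[Callable[[str], List[float]]] = None,
        by_text: bool = True,  # assumed flag name, for illustration
    ):
        self._embed_query = embed_query
        self._by_text = by_text

    def similarity_search_by_text(self, query: str, k: int = 4) -> Tuple:
        # Stands in for a near_text query (server-side text2vec module required).
        return ("near_text", query, k)

    def similarity_search_by_vector(self, vector: List[float], k: int = 4) -> Tuple:
        # Stands in for a near_vector query (vector computed client-side).
        return ("near_vector", vector, k)

    def similarity_search(self, query: str, k: int = 4) -> Tuple:
        if self._by_text:
            return self.similarity_search_by_text(query, k)
        if self._embed_query is None:
            raise ValueError("an embedding function is required when by_text is False")
        return self.similarity_search_by_vector(self._embed_query(query), k)


def length_embed(text: str) -> List[float]:
    # Hypothetical one-dimensional embedding, just to make the example runnable.
    return [float(len(text))]


text_mode = ToggleStore().similarity_search("hello")
vector_mode = ToggleStore(embed_query=length_embed, by_text=False).similarity_search("hello")
```

Defaulting the flag to the text path would keep existing behaviour for users who already run a text2vec module, while users who rely on their own embedding model opt in to the vector path.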