langchain
feat: filter on list of values
Issue
The old filter could only match a single value per field, because it used the Elasticsearch match query.
Description
This change allows the user to filter on a list of values using the Elasticsearch terms query. Fixes: https://github.com/hwchase17/langchain/issues/2095#issuecomment-1536225159
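To illustrate the difference between the two query types, here is a minimal sketch of how a metadata filter could be translated into Elasticsearch query clauses. It is not the code in this PR: the helper name `build_filter_clauses` and the `metadata.<key>` field layout are assumptions made for illustration. Only the `match` and `terms` clause shapes come from the Elasticsearch query DSL.

```python
from typing import Any, Dict, List


def build_filter_clauses(metadata_filter: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Translate a metadata filter into Elasticsearch query clauses.

    Hypothetical helper for illustration only, not the implementation in this PR.
    """
    clauses: List[Dict[str, Any]] = []
    for key, value in metadata_filter.items():
        # Assumption: metadata is indexed under a "metadata" object field.
        field = f"metadata.{key}"
        if isinstance(value, (list, tuple)):
            # A list of values maps to a `terms` clause (matches any listed value).
            clauses.append({"terms": {field: list(value)}})
        else:
            # A single value keeps the old behaviour: a `match` clause.
            clauses.append({"match": {field: value}})
    return clauses


print(build_filter_clauses({"page": [5, 6]}))
# [{'terms': {'metadata.page': [5, 6]}}]
print(build_filter_clauses({"page": 5}))
# [{'match': {'metadata.page': 5}}]
```

The resulting clauses would typically go into the `filter` part of a `bool` query, so they restrict the candidate documents without affecting the similarity score.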
could we add an integration test with a filter to make sure things work?
I have written an integration test with a filter.
def test_filter_query(self, elasticsearch_url: str) -> None:
    texts = ["foo", "bar", "baz", "hello", "igloo", "sharks"]
    metadatas = [{"page": i} for i in range(len(texts))]
    docsearch = ElasticVectorSearch.from_texts(
        texts,
        FakeEmbeddings(),
        metadatas=metadatas,
        elasticsearch_url=elasticsearch_url,
    )
    search_result = docsearch.similarity_search_by_vector(
        "sharks", k=1, filter={"page": [5, 6]}
    )
    assert len(search_result) != 0
However, I am unable to test it, because when I run
make integration_tests
I get the following error.
Could I please get some help on this?
Hey everyone, there is a workaround for this filtering.
vec = VectorStoreRetriever(vectorstore=vectorstore, search_kwargs={"where_document": {"$or": [{"$contains": "search_string_1"}, {"$contains": "search_string_2"}]}})
@vibha0411 you probably don't want to run all integration tests (that requires a bunch of optional imports and access tokens). you can run just the file you've changed with
pytest tests/integration_tests/<PATH_TO_FILE>.py
@pedrobuenoxs very cool, good to know! my sense is @vibha0411's change may still be good for ease of use
Stale; this Elastic vector store is being deprecated.