langchain
Add back create_index, add_texts, and from_texts to ElasticKnnSearch
Fixes https://github.com/hwchase17/langchain/issues/7117
Adding back `create_index`, `add_texts`, and `from_texts` to `ElasticKnnSearch`.
Quick Test

from langchain.vectorstores.elastic_vector_search import ElasticKnnSearch
from langchain.embeddings import ElasticsearchEmbeddings

# Initialize ElasticsearchEmbeddings
model_id = "sentence-transformers__all-distilroberta-v1"
dims = 768
es_cloud_id = ""
es_user = ""
es_password = ""
test_index = "knn_test_index_012"

embeddings = ElasticsearchEmbeddings.from_credentials(
    model_id,
    es_cloud_id=es_cloud_id,
    es_user=es_user,
    es_password=es_password,
)

# Initialize ElasticKnnSearch
knn_search = ElasticKnnSearch(
    es_cloud_id=es_cloud_id,
    es_user=es_user,
    es_password=es_password,
    index_name=test_index,
    embedding=embeddings,
)

# Test adding vectors
# Test `add_texts` method when index is not created
texts = ["Hello, world!", "Machine learning is fun.", "I love Python."]
knn_search.add_texts(texts)

# Test `from_texts` method when index is not created
new_texts = ["This is a new text.", "Elasticsearch is powerful.", "Python is fun."]
knn_search.from_texts(new_texts, dims=768)
Correctly throw an exception when index has not been previously created.
# Test `add_texts` method
texts = ["Hello, world!", "Machine learning is fun.", "I love Python."]
knn_search.add_texts(texts)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/runner/langchain-1/langchain/vectorstores/elastic_vector_search.py", line 621, in add_texts
    raise Exception(f"The index '{self.index_name}' does not exist. If you want to create a new index while encoding texts, call 'from_texts' instead.")
Exception: The index 'knn_test_index_012' does not exist. If you want to create a new index while encoding texts, call 'from_texts' instead.
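For reviewers, the guard behind this traceback can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: `ensure_index_exists`, `IndexMissingError`, and the `FakeClient` stub are hypothetical names, and the sketch only assumes the client exposes `indices.exists(index=...)` the way the official Elasticsearch Python client does.

```python
from typing import Any, Iterable


class IndexMissingError(Exception):
    """Hypothetical error type; the PR itself raises a plain Exception."""


def ensure_index_exists(client: Any, index_name: str) -> None:
    """Raise if the index is absent, reusing the message shown in the traceback."""
    # Assumption: `client.indices.exists(index=...)` returns a truthy value
    # when the index exists and a falsy value otherwise.
    if not client.indices.exists(index=index_name):
        raise IndexMissingError(
            f"The index '{index_name}' does not exist. If you want to create "
            "a new index while encoding texts, call 'from_texts' instead."
        )


class _FakeIndices:
    """In-memory stand-in for `client.indices`, for illustration only."""

    def __init__(self, existing: Iterable[str]) -> None:
        self._existing = set(existing)

    def exists(self, index: str) -> bool:
        return index in self._existing


class FakeClient:
    """Hypothetical stub so the sketch runs without a cluster."""

    def __init__(self, existing: Iterable[str] = ()) -> None:
        self.indices = _FakeIndices(existing)
```

Calling `ensure_index_exists(FakeClient(), "knn_test_index_012")` raises, while passing a stub that already contains the index name returns without error.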
Correctly create new index
# Test `from_texts` method
new_texts = ["This is a new text.", "Elasticsearch is powerful.", "Python is fun."]
knn_search.from_texts(new_texts, dims=768)
The mapping is as follows:
{
  "knn_test_index_012": {
    "mappings": {
      "properties": {
        "text": {
          "type": "text"
        },
        "vector": {
          "type": "dense_vector",
          "dims": 768,
          "index": true,
          "similarity": "dot_product"
        }
      }
    }
  }
}
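A mapping of this shape could be produced by a small helper along these lines. This is a sketch under assumptions: `build_knn_mapping` is a hypothetical name and not necessarily how `create_index` builds the mapping in this PR. The field names `text` and `vector` and the `dot_product` default are taken from the output above.

```python
from typing import Any, Dict


def build_knn_mapping(dims: int, similarity: str = "dot_product") -> Dict[str, Any]:
    """Return the `mappings` body for a text field plus an indexed dense vector."""
    if dims <= 0:
        raise ValueError("dims must be a positive integer")
    return {
        "properties": {
            # Raw text that was embedded
            "text": {"type": "text"},
            # Embedding vector, indexed so it can be used for kNN search
            "vector": {
                "type": "dense_vector",
                "dims": dims,
                "index": True,
                "similarity": similarity,
            },
        }
    }


mapping = build_knn_mapping(768)
```

Note that Elasticsearch documents `dot_product` similarity as requiring unit-length vectors, so this default presumably relies on the embedding model producing normalized output.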
Correctly index texts after index has been created
knn_search.add_texts(texts)
@benwtrent do you have time to do another review? I think I addressed all the issues. I removed ElasticKnnSearch as a subclass but tried to align the methods to be standard. I also return the Document type.
I'm also not sure how I picked up 27 other files that show as changed.
@baskaryan Somehow I picked up 27 other files to change in this PR. Are you able to take a look? It should just be the `langchain/vectorstores/elastic_vector_search.py` file.
Langchain repo underwent a large reorg. Closing this PR in favor of https://github.com/langchain-ai/langchain/pull/8180