langchain
TypeError: similarity_search_with_score_by_vector() got an unexpected keyword argument 'score_threshold'
System Info
python 3.9
current version
Who can help?
@agola11 @eyurtsev @hwchase17
Information
- [X] The official example notebooks/scripts
- [ ] My own modified scripts
Related Components
- [ ] LLMs/Chat Models
- [X] Embedding Models
- [ ] Prompts / Prompt Templates / Prompt Selectors
- [ ] Output Parsers
- [ ] Document Loaders
- [X] Vector Stores / Retrievers
- [ ] Memory
- [ ] Agents / Agent Executors
- [ ] Tools / Toolkits
- [ ] Chains
- [ ] Callbacks/Tracing
- [ ] Async
Reproduction
from langchain.embeddings import LlamaCppEmbeddings
from langchain.embeddings import HuggingFaceEmbeddings, SentenceTransformerEmbeddings
from langchain.embeddings import OpenAIEmbeddings
import json
from langchain.retrievers import SVMRetriever
embeddings = LlamaCppEmbeddings(model_path="ggml-model-q4_0.bin")
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
text_list = ['The first Nobel Prize in Physics was awarded in 1901 to Wilhelm Conrad R\u00f6ntgen \"for his discovery of the remarkable rays subsequently named after him\".',
#'The Nobel Prize in Physics is a yearly award given by the Royal Swedish Academy of Sciences for those who have made the most outstanding contributions for mankind in the field of physics. It is one of the five Nobel Prizes established by the 1895 will of Alfred Nobel, which are awarded for outstanding contributions in chemistry, physiology or medicine, literature, and physics. These prizes are awarded in Stockholm, Sweden. The first Nobel Prize in Physics was awarded to Wilhelm R\u00f6ntgen in 1901.',
#'The next Deadpool movie is set to be released on June 1, 2018.'
]
#print(documents)
db = FAISS.from_texts(text_list, embeddings)
retriever = db.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": .5})
docs = retriever.get_relevant_documents("who got the first nobel prize in physics")
print(docs)
Expected behavior
Traceback (most recent call last):
File "llama_index/l.py", line 56, in <module>
docs = retriever.get_relevant_documents("who got the first nobel prize in physics")
File "/scratch/c7031420/.conda/envs/langchain/lib/python3.9/site-packages/langchain/vectorstores/base.py", line 395, in get_relevant_documents
self.vectorstore.similarity_search_with_relevance_scores(
File ".conda/envs/langchain/lib/python3.9/site-packages/langchain/vectorstores/base.py", line 141, in similarity_search_with_relevance_scores
docs_and_similarities = self._similarity_search_with_relevance_scores(
File "/.conda/envs/langchain/lib/python3.9/site-packages/langchain/vectorstores/faiss.py", line 609, in _similarity_search_with_relevance_scores
docs_and_scores = self.similarity_search_with_score(
File "/.conda/envs/langchain/lib/python3.9/site-packages/langchain/vectorstores/faiss.py", line 245, in similarity_search_with_score
docs = self.similarity_search_with_score_by_vector(
TypeError: similarity_search_with_score_by_vector() got an unexpected keyword argument 'score_threshold'
Yes, I faced the same issue.
retriever = db.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": .5})
This does not work. Use this instead:
search_result = vector_store.similarity_search_with_score(query, k=30)
search_result
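Note that the workaround above returns every hit with its score and applies no threshold. A minimal sketch of filtering the results by hand follows. The helper name `filter_by_score` and the sample data are hypothetical, and whether a lower or a higher score means "more similar" depends on the vector store and its distance metric, so check that before choosing the comparison direction.

```python
from typing import Any, List, Tuple


def filter_by_score(
    results: List[Tuple[Any, float]],
    threshold: float,
    lower_is_better: bool = True,
) -> List[Tuple[Any, float]]:
    """Keep only (document, score) pairs that pass the threshold.

    Hypothetical helper, not part of LangChain. Set lower_is_better=True
    when the score is a distance, False when it is a similarity.
    """
    if lower_is_better:
        return [(doc, score) for doc, score in results if score <= threshold]
    return [(doc, score) for doc, score in results if score >= threshold]


# Stand-in data shaped like the (document, score) pairs returned by
# similarity_search_with_score; the scores here are made up.
sample = [("doc about Röntgen", 0.21), ("doc about Deadpool", 1.37)]
kept = filter_by_score(sample, threshold=0.5)
print(kept)
```

In real use, `sample` would be replaced by the `search_result` list from the call above.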
Hi, @abdoelsayed2016! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, you encountered a TypeError when using the similarity_search_with_score_by_vector() function with the 'score_threshold' keyword argument. SDcodehub suggested an alternative approach to resolve the issue.
Before we close this issue, could you please confirm if it is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!
@abdoelsayed2016 - Were you able to resolve this issue? I have the same problem. The code was working just fine before upgrading to langchain v0.1. It only works if I comment out the line with search_kwargs:
retriever = client.as_retriever(
search_type=search_type,
# search_kwargs=kwargs,
)
Same error here with Chroma.
Here are my library versions:
langchain 0.2.1
langchain-chroma 0.1.1
langchain-community 0.2.1
langchain-core 0.2.1
langchain-openai 0.1.7
langchain-text-splitters 0.2.0
langsmith 0.1.63
This is the traceback:
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: File "/home/ubuntu/chatbot/env/lib/python3.10/site-packages/langchain_core/retrievers.py", line 316, in get_relevant_documents
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: result = self._get_relevant_documents(
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: File "/home/ubuntu/chatbot/env/lib/python3.10/site-packages/langchain_core/vectorstores.py", line 696, in _get_relevant_documents
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: docs = self.vectorstore.similarity_search(query, **self.search_kwargs)
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: File "/home/ubuntu/chatbot/env/lib/python3.10/site-packages/langchain_community/vectorstores/chroma.py", line 349, in similarity_search
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: docs_and_scores = self.similarity_search_with_score(
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: File "/home/ubuntu/chatbot/env/lib/python3.10/site-packages/langchain_community/vectorstores/chroma.py", line 439, in similarity_search_with_score
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: results = self.__query_collection(
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: File "/home/ubuntu/chatbot/env/lib/python3.10/site-packages/langchain_core/utils/utils.py", line 36, in wrapper
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: return func(*args, **kwargs)
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: File "/home/ubuntu/chatbot/env/lib/python3.10/site-packages/langchain_community/vectorstores/chroma.py", line 156, in __query_collection
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: return self._collection.query(
May 28 13:39:30 ip-172-31-43-79 gunicorn[159355]: TypeError: Collection.query() got an unexpected keyword argument 'score_threshold'
After some investigation, it appears that we must set search_type='similarity_score_threshold' to use score_threshold, which is no longer supported by the default search_type='similarity'.
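To illustrate why the default search type fails, here is a toy sketch of the dispatch that the Chroma traceback suggests. `ToyStore` and `retrieve` are hypothetical stand-ins written for this comment, not LangChain code. The assumption is that the 'similarity' path forwards every search kwarg to a backend that does not accept score_threshold, while the 'similarity_score_threshold' path consumes it first. This matches the Chroma report above; the original FAISS report already used 'similarity_score_threshold' and failed anyway, so it is not explained by this sketch.

```python
class ToyStore:
    """Hypothetical stand-in for a vector store; not LangChain code."""

    def __init__(self, scored_docs):
        # List of (text, relevance) pairs with relevance in [0, 1].
        self._scored_docs = scored_docs

    def _backend_query(self, query, k=4):
        # Like the backend in the traceback, this accepts no extra kwargs.
        ranked = sorted(self._scored_docs, key=lambda pair: pair[1], reverse=True)
        return ranked[:k]

    def similarity_search(self, query, **kwargs):
        # Forwards every kwarg, so an unknown one raises TypeError.
        return [doc for doc, _ in self._backend_query(query, **kwargs)]

    def similarity_search_with_relevance_scores(self, query, **kwargs):
        # Consumes score_threshold before calling the backend.
        threshold = kwargs.pop("score_threshold", None)
        pairs = self._backend_query(query, **kwargs)
        if threshold is not None:
            pairs = [(doc, score) for doc, score in pairs if score >= threshold]
        return pairs


def retrieve(store, query, search_type="similarity", search_kwargs=None):
    """Dispatch on search_type, assumed to mirror the retriever's behavior."""
    search_kwargs = search_kwargs or {}
    if search_type == "similarity":
        return store.similarity_search(query, **search_kwargs)
    if search_type == "similarity_score_threshold":
        pairs = store.similarity_search_with_relevance_scores(query, **search_kwargs)
        return [doc for doc, _ in pairs]
    raise ValueError(f"unsupported search_type: {search_type}")


store = ToyStore([("Röntgen, 1901", 0.9), ("Deadpool release date", 0.2)])
kwargs = {"score_threshold": 0.5}

# Works: the threshold is consumed and applied.
threshold_docs = retrieve(store, "first nobel prize", "similarity_score_threshold", kwargs)

# Fails: the threshold is forwarded to a backend that does not accept it.
try:
    retrieve(store, "first nobel prize", "similarity", kwargs)
    default_error = None
except TypeError as exc:
    default_error = str(exc)

print(threshold_docs)
print(default_error)
```

With a real store, the equivalent fix is to pass search_type="similarity_score_threshold" together with search_kwargs={"score_threshold": ...} to as_retriever, as in the original reproduction.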