Embedding Function not properly passed to Chroma Collection
https://discord.com/channels/1038097195422978059/1038097349660135474/1082685778582310942
Details are in the Discord thread.
The code change here changed how Chroma handles embedding functions, and ours seems to be sent as None for some reason.
Seeing this too. Version 0.0.102.
Code:
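from langchain.document_loaders import TextLoader
from langchain.embeddings import TensorflowHubEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma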
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
embeddings = TensorflowHubEmbeddings()
docsearch = Chroma.from_documents(texts, embeddings)
Error:
File /usr/local/lib/python3.10/dist-packages/langchain/vectorstores/chroma.py:67, in Chroma.__init__(self, collection_name, embedding_function, persist_directory)
62 logger.warning(
63 f"Collection {collection_name} already exists,"
64 " Do you have the right embedding function?"
65 )
66 else:
---> 67 self._collection = self._client.create_collection(
68 name=collection_name,
69 embedding_function=self._embedding_function.embed_documents
70 if self._embedding_function is not None
71 else None,
72 )
TypeError: LocalAPI.create_collection() got an unexpected keyword argument 'embedding_function'
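For anyone landing here: the TypeError above says that the `create_collection` method actually being called does not accept an `embedding_function` keyword, which usually points at a mismatch between the installed langchain and chromadb versions. A minimal sketch of how one could check for that before making the call, using only the standard library. `OldClient`, `NewClient`, and `create_collection_compat` are hypothetical stand-ins made up for illustration; they are not the real chromadb classes and not the fix in the PR.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with `name=` as a keyword argument."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return True  # a **kwargs catch-all accepts any keyword
    param = params.get(name)
    return param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )


# Hypothetical stand-ins for two client versions; NOT the real chromadb classes.
class OldClient:
    def create_collection(self, name, metadata=None):
        return {"name": name}


class NewClient:
    def create_collection(self, name, metadata=None, embedding_function=None):
        return {"name": name, "embedding_function": embedding_function}


def create_collection_compat(client, name, embedding_function=None):
    """Pass embedding_function only when the client's method accepts it."""
    kwargs = {}
    if embedding_function is not None and accepts_keyword(
        client.create_collection, "embedding_function"
    ):
        kwargs["embedding_function"] = embedding_function
    return client.create_collection(name=name, **kwargs)
```

Running `inspect.signature(client.create_collection)` against the real client in your environment shows which keywords your installed version accepts.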
I believe this should fix it. https://github.com/hwchase17/langchain/pull/1444
This works, thank you!
#1444 was merged in - is this resolved in 0.0.107?
I still seem to be having this issue, has it been resolved?
I haven't seen this issue since the aforementioned merge commit.
I still get the same error.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-26-13b99c9bccce> in <cell line: 5>()
3 text_field = "text"
4 index = pinecone.Index(index_name)
----> 5 vectorstore = Pinecone(
6 index=index,
7 embedding_function=embed.embed_query,
TypeError: Pinecone.__init__() got an unexpected keyword argument 'embedding_function'
This is the code that produces it (embed and index_name are defined earlier in the notebook):
import pinecone
from langchain.vectorstores import Pinecone
text_field = "text"
index = pinecone.Index(index_name)
vectorstore = Pinecone(
    index=index,
    embedding_function=embed.embed_query,
    text_key=text_field)
query = "foo"
top = vectorstore.similarity_search(query, k=3)
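Since the reports in this thread span several releases, it helps to state the installed versions when confirming whether the error still reproduces. A small sketch using the standard library's `importlib.metadata`; the distribution names in the default tuple (in particular `pinecone-client`) are assumptions about what is installed under pip, so adjust them to your environment.

```python
from importlib import metadata


def installed_versions(packages=("langchain", "chromadb", "pinecone-client")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for pkg, ver in installed_versions().items():
        print(f"{pkg}: {ver or 'not installed'}")
```

Pasting that output alongside the traceback makes it easier to tell whether a report predates or postdates the merge.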
Hi, @KMontag42! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, you reported an issue with the recent code change in Chroma Collection where embedding functions are being sent as None. This issue was confirmed by keviddles and harithzulfaizal. A fix was proposed in pull request #1444 by timothyasp and merged. However, it seems that the fix did not completely resolve the issue as reported by sedgar03 and eWizardII.
To help us keep track of the current status, could you please let us know if this issue is still relevant to the latest version of the LangChain repository? If it is, please comment on the issue and let us know. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your understanding and cooperation. If you have any further questions or concerns, please don't hesitate to reach out.