[Bug]: Python kernel crashes when using Chroma's from_texts
What happened?
When using the from_texts method of Chroma, the Python kernel crashes without any error messages. The process finishes with exit code -1073741819 (0xC0000005). This issue occurs consistently and makes it impossible to use this method effectively. I am using the latest version of Chroma and have tried on different environments but still encounter the same problem. Any help or suggestions to resolve this issue would be greatly appreciated.
Versions
System Information
OS: Windows
OS Version: 10.0.22631
Python Version: 3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 15:03:56) [MSC v.1929 64 bit (AMD64)]
Package Information
langchain_core: 0.3.5
langchain: 0.3.0
langchain_community: 0.3.0
langsmith: 0.1.125
langchain_experimental: 0.3.0
langchain_huggingface: 0.1.0
langchain_text_splitters: 0.3.0
Optional packages not installed
langgraph
langserve
Other Dependencies
aiohttp: 3.9.5
async-timeout: Installed. No version info available.
dataclasses-json: 0.6.7
httpx: 0.27.0
huggingface-hub: 0.24.5
jsonpatch: 1.33
numpy: 1.26.4
orjson: 3.10.6
packaging: 23.2
pydantic: 2.8.2
pydantic-settings: 2.5.2
PyYAML: 6.0.1
requests: 2.32.2
sentence-transformers: 3.0.1
SQLAlchemy: 2.0.30
tenacity: 8.5.0
tokenizers: 0.19.1
transformers: 4.44.0
typing-extensions: 4.11.0
Relevant log output
from langchain_community.vectorstores import Chroma
embed_model_path = '.././AI-ModelScope/bge-small-en-v1___5'
from langchain_huggingface import HuggingFaceEmbeddings
embedding = HuggingFaceEmbeddings(model_name=embed_model_path)
texts = [
    "Test"
]
try:
    smalldb_chinese = Chroma.from_texts(texts, embedding=embedding)
except Exception as r:
    print('%s' %(r))
Process finished with exit code -1073741819 (0xC0000005)
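A note on the exit code, since the try/except in the snippet never prints anything: -1073741819 is the signed 32-bit form of 0xC0000005, which Windows defines as STATUS_ACCESS_VIOLATION. That is a crash in native code, so the interpreter is terminated before any Python exception can be raised or caught. The sketch below only shows the conversion; the helper names and the one-entry lookup table are made up for illustration.

```python
def to_ntstatus(exit_code):
    """Reinterpret a signed 32-bit process exit code as an unsigned NTSTATUS value."""
    return exit_code & 0xFFFFFFFF


# Illustrative lookup only, not an exhaustive list of NTSTATUS codes.
KNOWN_STATUS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def describe_exit_code(exit_code):
    """Format an exit code as hex, with a name when the code is in the table."""
    status = to_ntstatus(exit_code)
    name = KNOWN_STATUS.get(status, "unknown")
    return "0x%08X (%s)" % (status, name)


print(describe_exit_code(-1073741819))  # prints: 0xC0000005 (STATUS_ACCESS_VIOLATION)
```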
@Kviilen, we have recently observed kernel crashes on Windows systems. One workaround is to update your system with the latest patches from Microsoft and use Python 3.10. Alternatively, users have reported that running Chroma 0.5.3 also solved the problem (although I wouldn't recommend downgrading).
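Since both workarounds depend on versions, and the report above does not list the installed Chroma version, it may help to print what the environment actually has. This is a minimal sketch using only the standard library; the distribution names passed in (chromadb, langchain-community, langchain-huggingface) are my assumption about how the packages are named on PyPI, and the function names are hypothetical.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(dist_names=("chromadb", "langchain-community", "langchain-huggingface")):
    """Collect the Python version plus the version of each named distribution."""
    report = {"python": "%d.%d.%d" % sys.version_info[:3]}
    for name in dist_names:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print("%s: %s" % (key, value))
```

Pasting that output into the issue makes it easier to tell whether a given setup matches the Python 3.10 or 0.5.3 workaround.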
I have traced the crash to line 382 of embeddings_queue.py. The code there, if len(filtered_embeddings) > 0: sub.callback(filtered_embeddings), is what causes the Python kernel to crash.
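For anyone who wants to confirm the crash location on their own machine, one option is Python's built-in faulthandler module, which dumps the Python traceback when the process dies from a fatal error such as an access violation. The following is a rough sketch, not a verified reproduction: the log file name and the run_repro wrapper are hypothetical, and the body of run_repro just repeats the snippet from the report.

```python
import faulthandler

# The file object must stay open for the life of the process, because
# faulthandler writes to its file descriptor at the moment of the crash.
crash_log = open("chroma_crash_traceback.log", "w")
faulthandler.enable(file=crash_log, all_threads=True)


def run_repro():
    """Hypothetical wrapper around the reproduction snippet from the report."""
    # Imported here so the crash handler is armed before any native code loads.
    from langchain_community.vectorstores import Chroma
    from langchain_huggingface import HuggingFaceEmbeddings

    embedding = HuggingFaceEmbeddings(model_name=".././AI-ModelScope/bge-small-en-v1___5")
    return Chroma.from_texts(["Test"], embedding=embedding)
```

Calling run_repro() and then opening the log file after the process exits should show which Python frames were active when the crash happened.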
Thanks @Kviilen. I was able to run Chroma locally both by downgrading Chroma and, alternatively, by lowering the Python version to 3.10.
However, the query results are not clear to me. The query returns results (documents and scores) for a completely unrelated query term, and I cannot work out why. The documentation also does not clearly describe this behaviour.
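On the unrelated results: a nearest-neighbour search generally returns the k closest stored items even when none of them is actually close to the query, so getting documents back for an unrelated term is expected. Also, if the scores come from similarity_search_with_score in the LangChain Chroma wrapper, my understanding is that they are distances, so a lower score means more similar. The toy example below is not Chroma's implementation; the vectors, function names, and threshold are invented purely to illustrate the behaviour.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, store, k=2):
    """Return the k stored documents closest to the query, nearest first."""
    scored = [(doc, l2_distance(query_vec, vec)) for doc, vec in store.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


def filter_by_distance(results, max_distance):
    """Keep only results whose distance is at or below a chosen threshold."""
    return [(doc, dist) for doc, dist in results if dist <= max_distance]


# Made-up 2-D "embeddings" for three documents.
store = {"cats": [1.0, 0.0], "dogs": [0.9, 0.1], "tax law": [0.0, 1.0]}

# A query far away from every stored vector still gets k results back.
results = top_k([-5.0, -5.0], store, k=2)
for doc, dist in results:
    print("%s: %.2f" % (doc, dist))

# Applying a distance cut-off is one way to drop matches that are not close.
print(filter_by_distance(results, max_distance=1.0))
```

In practice, that means checking the magnitude of the scores rather than just the fact that documents came back. What counts as a reasonable cut-off depends on the embedding model and distance metric in use.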
Tracking Windows crashes in #2513