
Import error and undefined symbol

Open Mahathi-Bhagavatula opened this issue 1 year ago • 3 comments

Hi, when I try to index documents using chromadb, I get the following error. After looking into it, I understand it is a compatibility issue, but I couldn't find out exactly which package versions hnswlib is compatible with.

ImportError: /anaconda3/envs/myenv/lib/python3.9/site-packages/hnswlib.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZNSt15__exception_ptr13exception_ptr10_M_releaseEv
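For anyone triaging the same failure: the mangled name in the message appears to demangle to `std::__exception_ptr::exception_ptr::_M_release()`, a libstdc++ symbol, which usually suggests the compiled extension was built against a different C++ runtime than the one loaded at import time. The following is only a minimal sketch for telling that case apart from a package that is simply not installed. The helper names `classify_import_message` and `try_import` are hypothetical and not part of hnswlib, chromadb, or LangChain.

```python
import importlib


def classify_import_message(message: str) -> str:
    """Roughly classify an ImportError message (hypothetical helper)."""
    # "undefined symbol" is what the dynamic loader reports on Linux when a
    # shared object needs a symbol that no loaded library provides.
    if "undefined symbol" in message:
        return "binary-mismatch"
    return "other"


def try_import(module_name: str) -> str:
    """Try to import a module and report how it went (hypothetical helper)."""
    try:
        importlib.import_module(module_name)
    except ModuleNotFoundError:
        # The package is not installed in this environment at all.
        return "missing"
    except ImportError as exc:
        # The package is installed but failed while loading.
        return classify_import_message(str(exc))
    return "ok"


if __name__ == "__main__":
    print("hnswlib:", try_import("hnswlib"))
```

A result of `binary-mismatch` points toward rebuilding or reinstalling the compiled package in the same environment rather than toward a bug in the calling code.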

Mahathi-Bhagavatula avatar Apr 17 '23 13:04 Mahathi-Bhagavatula

Same issue

tomconversion avatar May 02 '23 02:05 tomconversion

same issue

sinia avatar May 04 '23 16:05 sinia

~~I finally got it running by installing hnswlib with conda: conda install -c conda-forge hnswlib~~

Scratch that!

Installing from conda-forge gave me the wrong version of hnswlib (0.6.2), which is incompatible with the chromadb version I needed.

I removed the conda-installed hnswlib and followed the instructions from https://github.com/chroma-core/chroma/issues/538#issuecomment-1546750511

namely:

pip install hnswlib --user --no-build-isolation
pip install chromadb --user
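To confirm which versions actually ended up in the active environment after reinstalling, something like the sketch below may help. It relies only on the standard library's `importlib.metadata`. The function name `installed_versions` is hypothetical, and the distribution names passed to it are assumed to match the pip package names used above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in the environment this interpreter is using.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names, taken from the pip commands in this thread.
    print(installed_versions(["hnswlib", "chromadb"]))
```

Run it with the same interpreter that raises the error (for example, inside the notebook kernel), since conda, `--user`, and venv installs can each resolve to a different site-packages directory.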

dswah avatar May 19 '23 10:05 dswah

Same issue here as I was following the LangChain step by step tutorial about QA over unstructured data.

I am working locally in VSCode, in a Jupyter notebook inside a venv where langchain, openai, chromadb, unstructured and ipykernel have been installed via pip.

This issue appears in step 3 when executing the last line :

# Step 1 : Load
from langchain.document_loaders import UnstructuredPDFLoader
loader = UnstructuredPDFLoader(PDF_FILE_PATH)
data = loader.load()

# Step 2 : Split
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 0)
all_splits = text_splitter.split_documents(data)

# Step 3 : Store
%env OPENAI_API_KEY=my-secret-value
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())

Results: (screenshot not preserved)

I don't think this has anything to do with the Jupyter notebook, since I get the same error when I open a python3 CLI in the terminal: (screenshot not preserved)

I've tried using a Poetry virtual env instead, but that didn't change anything.

I'd love to discover LangChain, so any idea on how to solve this will be appreciated, thanks ! :heart:

hvassard avatar Aug 01 '23 15:08 hvassard

This problem disappeared, but I'm not sure whether this is due to an upgrade of langchain or to the installation of pdfminer-six, as explained in this Stack Overflow post

hvassard avatar Aug 10 '23 08:08 hvassard

Hi, @Mahathi-Bhagavatula! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding, you were experiencing an import error and undefined symbol when trying to index documents using chromadb. It seems that other users, such as tomconversion, sinia, and dswah, have also encountered the same issue. dswah found a solution by installing hnswlib with pip instead of conda. hvassard faced a similar problem while following a tutorial; it went away after either upgrading langchain or installing pdfminer-six.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to the LangChain project!

dosubot[bot] avatar Nov 09 '23 16:11 dosubot[bot]