ModuleNotFoundError: No module named 'langchain.docstore'
🔎 Search before asking
- [x] I have searched the PaddleOCR Docs and found no similar bug report.
- [x] I have searched the PaddleOCR Issues and found no similar bug report.
- [x] I have searched the PaddleOCR Discussions and found no similar bug report.
🐛 Bug (problem description)
On Ubuntu 22.04, I installed paddlepaddle via Docker:
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.2.0
Then I started and entered the container:
docker run --name paddle -it -v $PWD:/paddle ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.2.0 /bin/bash
Inside the container, I installed PaddleOCR:
python -m pip install "paddleocr[all]"
After installation, I ran the command-line inference example from the documentation inside the container:
paddleocr doc_parser -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png
The following error occurred:
Traceback (most recent call last):
File "/usr/local/bin/paddleocr", line 5, in
🏃♂️ Environment
OS ubuntu22.04
CPU I5-7400 x86_64
🌰 Minimal Reproducible Example
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.2.0
docker run --name paddle -it -v $PWD:/paddle ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.2.0 /bin/bash
python -m pip install "paddleocr[all]"
paddleocr doc_parser -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png
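The installed version may be wrong. Please try running through the installation steps in the documentation again. If the problem persists, we will follow up and resolve it promptly.
Same here. I would recommend pinning the dependency versions.
Change the LangChain version to <1.0.0.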
Mine won't run either.
C:\Users\24723\Desktop\test>paddleocr text_detection -i 1.png
Traceback (most recent call last):
File "
@as882301 Change the LangChain version to <1.0.0.
Switching versions did the trick. Thanks a lot!
Which version did you use? When installing LangChain, does langchain-community need to be installed separately?
I only switched the version and it ran; I didn't touch anything else. The version I switched to is 0.3.27.
Works fine.
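For anyone unsure whether this applies to their environment, here is a minimal sketch that checks the installed langchain version. The helper names (`major_version`, `langchain_needs_pin`, `installed_langchain_version`) are made up for illustration; only `importlib.metadata` from the standard library is used. If it reports 1.x or newer, pinning with something like `python -m pip install "langchain<1.0.0"` matches the workaround reported above (0.3.27 worked for one user).

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '0.3.27'."""
    return int(version.split(".")[0])


def langchain_needs_pin(installed: Optional[str]) -> bool:
    """True when langchain is installed and is 1.x or newer."""
    return installed is not None and major_version(installed) >= 1


def installed_langchain_version() -> Optional[str]:
    """Return the installed langchain version, or None if it is not installed."""
    try:
        return metadata.version("langchain")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_langchain_version()
    if found is None:
        print("langchain is not installed")
    elif langchain_needs_pin(found):
        print(f"langchain {found} found: the old langchain.docstore import will fail")
    else:
        print(f"langchain {found} found: the old import paths should still work")
```

This only inspects package metadata, so it is safe to run before importing paddleocr.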
I also updated to langchain v1.0 and ran into this issue. The solution below does not require reverting langchain to an older version.
Error:
File "../venv/lib/python3.13/site-packages/paddlex/inference/pipelines/components/retriever/base.py", line 25, in
Solution
I changed just two import lines in the base.py file referenced above, replacing:
if is_dep_available("langchain"):
    from langchain.docstore.document import Document
    from langchain.text_splitter import RecursiveCharacterTextSplitter
with:
if is_dep_available("langchain"):
    from langchain_core.documents import Document
    from langchain_text_splitters import RecursiveCharacterTextSplitter
This works because the current import paths for Document and the text splitter in modern versions of LangChain (the ones compatible with LangChain 1.x.x) are:
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
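If the same file has to keep working with both older and newer LangChain installs, one option is to try the new import paths first and fall back to the old ones. The sketch below is not what paddlex does; `import_first_available`, `load_langchain_types`, and the two candidate lists are hypothetical names, and the module paths are just the ones quoted in this thread.

```python
import importlib
from typing import Any, Iterable, Tuple


def import_first_available(candidates: Iterable[Tuple[str, str]]) -> Any:
    """Return the first attribute importable from the (module, name) pairs."""
    errors = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{module_name}.{attr_name}: {exc}")
    raise ImportError("no candidate could be imported: " + "; ".join(errors))


# New-style paths first, old-style paths as a fallback.
DOCUMENT_CANDIDATES = [
    ("langchain_core.documents", "Document"),
    ("langchain.docstore.document", "Document"),
]
SPLITTER_CANDIDATES = [
    ("langchain_text_splitters", "RecursiveCharacterTextSplitter"),
    ("langchain.text_splitter", "RecursiveCharacterTextSplitter"),
]


def load_langchain_types() -> Tuple[Any, Any]:
    """Resolve Document and RecursiveCharacterTextSplitter from whichever layout exists."""
    return (
        import_first_available(DOCUMENT_CANDIDATES),
        import_first_available(SPLITTER_CANDIDATES),
    )
```

Calling `load_langchain_types()` raises `ImportError` with the collected reasons when neither layout is installed, which makes the failure easier to read than the bare `ModuleNotFoundError` in the title.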
Using paddleocr in a repository that has already migrated to langchain >1.0 is not possible right now.
Is there any update on this? Ideally, the dependencies could be decoupled a bit, because for just using some standard OCR model without paddleocr's built-in LLM features, langchain should not be required I guess?
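To illustrate what decoupling could look like, here is a rough sketch of deferring an optional import until it is first used, so that code paths which never touch it do not fail at startup. `LazyModule` is a hypothetical helper written for this comment, not something paddleocr or paddlex provides, and whether it fits their module layout is an assumption.

```python
import importlib
from typing import Any


class LazyModule:
    """Import the wrapped module only when one of its attributes is first accessed."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._module = None

    def __getattr__(self, attr: str) -> Any:
        # Only called for attributes not found on the instance itself,
        # so _name and _module are resolved normally.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# Creating the wrapper never imports anything, even if the package is missing.
langchain_core_documents = LazyModule("langchain_core.documents")
```

With this pattern, an `ImportError` would only surface when a retriever feature actually accesses something like `langchain_core_documents.Document`, not when running plain text detection.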