
PDF loading fails

Open yang0 opened this issue 2 years ago • 2 comments

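e:\a.txt loads successfully, but e:\a.pdf fails to load. The first few pages of the PDF are images and the rest is text. The failure does not report any further error. How should I troubleshoot this?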

yang0, Apr 15 '23 07:04

You can try loading the PDF locally with langchain's UnstructuredFileLoader directly, then debug to see what the problem is.
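A minimal sketch of that local check, assuming a 2023-era langchain in which `UnstructuredFileLoader` is importable from `langchain.document_loaders` (newer releases may expose it from a different package). The helper name `debug_load` and its `loader_cls` parameter are hypothetical and only for illustration. It returns the full traceback instead of swallowing it, which helps when the application reports a failure without details.

```python
import traceback


def debug_load(path, loader_cls=None):
    """Try to load `path`; return (docs, None) on success or (None, traceback_text)."""
    try:
        if loader_cls is None:
            # Assumed import path for the langchain version in use at the time.
            from langchain.document_loaders import UnstructuredFileLoader
            loader_cls = UnstructuredFileLoader
        docs = loader_cls(path).load()
        return docs, None
    except Exception:
        # Keep the whole traceback so a missing dependency or parser error is visible.
        return None, traceback.format_exc()


# Example usage (the path is the one from the report):
#   docs, err = debug_load(r"e:\a.pdf")
#   print(err if err else f"loaded {len(docs)} document(s)")
```

Because the import sits inside the `try`, a missing or broken langchain/unstructured installation also shows up in the returned traceback rather than as a silent failure.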


imClumsyPanda, Apr 15 '23 07:04

installing.rst: some libraries may not be installed. PDF loading needs at least libmagic-dev, poppler-utils, tesseract-ocr, and detectron2.
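A rough way to check for those dependencies from Python, offered as a sketch rather than an official diagnostic. The mapping is an assumption: `pdftoppm` stands in for poppler-utils, `tesseract` for tesseract-ocr, the `magic` module (python-magic) for libmagic, and the `detectron2` module for detectron2. The function name `missing_pdf_dependencies` is hypothetical. The package names listed are Debian/Ubuntu ones, so on Windows the equivalents have to be installed another way, but the executables and modules checked here are the same.

```python
import importlib.util
import shutil

# Assumed mapping from the packages named above to something checkable.
EXECUTABLES = {"poppler-utils": "pdftoppm", "tesseract-ocr": "tesseract"}
MODULES = {"libmagic-dev (via python-magic)": "magic", "detectron2": "detectron2"}


def missing_pdf_dependencies(which=shutil.which, find_spec=importlib.util.find_spec):
    """Return the names of dependencies that cannot be found on this machine."""
    missing = []
    for package, exe in EXECUTABLES.items():
        if which(exe) is None:  # executable not on PATH
            missing.append(package)
    for package, module in MODULES.items():
        if find_spec(module) is None:  # Python module not importable
            missing.append(package)
    return missing


# Example usage:
#   print(missing_pdf_dependencies() or "all PDF dependencies found")
```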

lurenlym, Apr 17 '23 06:04

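This issue has been inactive for a long time, so the development team is closing it. If you still need help, feel free to raise it again.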

zRzRzRzRzRzRzR, Sep 27 '23 12:09