[Bug]: AttributeError: 'Pdf' object has no attribute 'page_chars'
Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
Branch name
V0.13.0
Commit ID
1d0a560
Other environment information
NAME="CentOS Stream"
VERSION="9"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="9"
PLATFORM_ID="platform:el9"
PRETTY_NAME="CentOS Stream 9"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:centos:centos:9"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://issues.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"
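Actual behavior
Uploading a PDF file fails during parsing with AttributeError: 'Pdf' object has no attribute 'page_chars'. The full traceback is under Additional information.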
Expected behavior
Uploading PDF files to build a knowledge base should not throw errors.
Steps to reproduce
1. Start ragflow from source code
2. Create an empty knowledge base
3. Upload PDF files to build the knowledge base
Additional information
Traceback (most recent call last):
  File "/home/shiyalun/project/opensource/ragflow/rag/svr/task_executor.py", line 174, in build
    cks = chunker.chunk(row["name"], binary=binary, from_page=row["from_page"],
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shiyalun/project/opensource/ragflow/rag/app/qa.py", line 356, in chunk
    qai_list, tbls = pdf_parser(filename if not binary else binary,
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shiyalun/project/opensource/ragflow/rag/app/qa.py", line 77, in __call__
    self.__images__(
  File "/home/shiyalun/project/opensource/ragflow/deepdoc/parser/pdf_parser.py", line 989, in __images__
    range(len(self.page_chars))]
              ^^^^^^^^^^^^^^^
AttributeError: 'Pdf' object has no attribute 'page_chars'. Did you mean: 'page_images'?
Could you attach the PDF file, if it's convenient for you?
I upgraded to pdfplumber==0.11.1 and that resolved the error.
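For anyone hitting the same error, here is a minimal sketch for checking which pdfplumber version is installed in the environment that runs ragflow. It uses only the standard library. The helper names (`version_tuple`, `installed_version`, `meets_reported_version`) are made up for illustration, and the 0.11.1 threshold is simply the version reported as working above. Whether older releases are actually the cause is an assumption, not something confirmed in this thread.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Version reported as working in the comment above; treating it as a
# minimum is an assumption, not a confirmed requirement.
REPORTED_WORKING = "0.11.1"


def version_tuple(text):
    """Turn '0.11.1' into (0, 11, 1); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def meets_reported_version(installed, required=REPORTED_WORKING):
    """True if `installed` is at least the version reported as working."""
    if installed is None:
        return False
    return version_tuple(installed) >= version_tuple(required)


if __name__ == "__main__":
    found = installed_version("pdfplumber")
    if meets_reported_version(found):
        print(f"pdfplumber {found} is at or above {REPORTED_WORKING}")
    else:
        print(f"pdfplumber {found} is older than {REPORTED_WORKING} or missing")
```

Run it with the same Python interpreter that starts the task executor, since a different virtual environment may have a different pdfplumber installed.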