ragflow
[Bug]: The size of PDF chunks is not limited, which can lead to extremely large chunks that exceed the limitations of the LLM in extreme cases.
Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
Branch name
main
Commit ID
4cda40c
Other environment information
No response
Actual behavior
chunk limited to 256
Expected behavior
No response
Steps to reproduce
Upload and parse a PDF that contains a table and a picture.
Additional information
No response
What is the chunking method you refer to?
If you're not using 'One', you can limit chunk size with the task page size setting here:
It limits the number of pages processed by one task.
A PDF file is parsed and split into chunks; each chunk's size should be limited.
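To make the request concrete, here is a minimal sketch of a post-processing step that caps chunk size. It is not RAGFlow's code: the function name `cap_chunk_size`, the `max_tokens` parameter, and the whitespace tokenizer are all assumptions chosen for illustration, and a real pipeline would count tokens with the model's own tokenizer.

```python
def cap_chunk_size(chunks, max_tokens=256):
    """Split any chunk that exceeds max_tokens into smaller pieces.

    Hypothetical helper: tokens are approximated by whitespace-separated
    words, which is only a stand-in for a real tokenizer.
    """
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")

    capped = []
    for chunk in chunks:
        tokens = chunk.split()
        # Walk the token list in windows of at most max_tokens.
        for start in range(0, len(tokens), max_tokens):
            capped.append(" ".join(tokens[start:start + max_tokens]))
    return capped


if __name__ == "__main__":
    oversized = ["one two three four five"]
    print(cap_chunk_size(oversized, max_tokens=2))
```

With a cap like this, a large table or a long run of text cannot produce a single chunk bigger than the limit, at the cost of possibly cutting a table in the middle.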
I see and agree.
Each chunk of a PDF is limited to 12 pages by default.
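As a rough illustration of what a 12-page limit means, the sketch below splits a document's pages into ranges of at most 12 pages each. The function name `page_ranges` and its parameters are hypothetical and are not taken from the RAGFlow codebase; only the default of 12 comes from the comment above.

```python
def page_ranges(total_pages, task_page_size=12):
    """Return half-open (start, end) page ranges of at most task_page_size pages.

    Hypothetical helper for illustration; pages are numbered from 0.
    """
    if task_page_size < 1:
        raise ValueError("task_page_size must be at least 1")

    return [
        (start, min(start + task_page_size, total_pages))
        for start in range(0, total_pages, task_page_size)
    ]


if __name__ == "__main__":
    # A 30-page PDF is covered by three ranges, the last one shorter.
    print(page_ranges(30))
```

Note that a page limit bounds how much of the file is handled at once, but it does not by itself bound the amount of text in the result, since 12 dense pages can still hold a lot of content.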