[Bug]: OCR on a large file (> 128 MB) reports an error
Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
Branch name
any
Commit ID
any
Other environment information
No response
Actual behavior
Hi, I have set `export MAX_CONTENT_LENGTH=1024288000`, and the upload succeeded. But during the OCR process it reports `[ERROR] File size exceeds (<= 128Mb)` and loses a lot of content. I checked the source code; at this point `row["size"]` was 176,399,410 and `DOC_MAXIMUM_SIZE` was 1,024,288,000:

```python
def build(row):
    if row["size"] > DOC_MAXIMUM_SIZE:
        set_progress(row["id"], prog=-1,
                     msg="File size exceeds( <= %dMb )" % (int(DOC_MAXIMUM_SIZE / 1024 / 1024)))
        return []

    callback = partial(
        set_progress,
        row["id"],
        row["from_page"],
        row["to_page"])
    chunker = FACTORY[row["parser_id"].lower()]
```

Since 176,399,410 is smaller than 1,024,288,000, this size check did not trigger, yet the error still appeared. What could be the problem? Is there a separate limit in the OCR step?
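For context, a minimal sketch of how a limit like `DOC_MAXIMUM_SIZE` is commonly derived from the environment (the exact settings module and default value are assumptions, not confirmed from the source):

```python
import os

# Assumption: the worker falls back to a hard-coded 128 MB default
# when MAX_CONTENT_LENGTH is missing from *its* environment.
DOC_MAXIMUM_SIZE = int(os.environ.get("MAX_CONTENT_LENGTH", 128 * 1024 * 1024))
```

If the OCR task runs in a separate process or container that never saw the exported variable, it would still use the 128 MB default, which would match the reported error.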
Expected behavior
No response
Steps to reproduce
Upload a PDF file larger than 128 MB.
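For testing, one way to produce an oversized file (a hypothetical helper; any PDF over 128 MB reproduces it, and whether the parser tolerates padded files is an assumption):

```python
import shutil

# Hypothetical repro helper: pad a small PDF past the 128 MB limit.
# Lenient PDF readers ignore trailing bytes after %%EOF.
TARGET = 129 * 1024 * 1024  # just over 128 MB
shutil.copyfile("small.pdf", "big.pdf")  # hypothetical input file
with open("big.pdf", "ab") as f:
    f.write(b"%" * max(0, TARGET - f.tell()))
```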
Additional information
No response
Specify MAX_CONTENT_LENGTH in docker/.env if you use docker.
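For example (a sketch; the variable name follows this thread, but how compose injects it into the service is an assumption):

```
# docker/.env
MAX_CONTENT_LENGTH=1024288000
```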
I run from the source code; docker only starts the database. I added a print at the same `build()` check quoted above, and it never triggers. `MAX_CONTENT_LENGTH` is now 1,024,288,000.
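One quick check (a sketch, assuming the limit is read from the process environment as above): print the variable from inside the process that actually runs the OCR task, not the shell you exported it from:

```python
import os

# If this prints None inside the OCR worker, the exported
# MAX_CONTENT_LENGTH never reached that process and the code
# falls back to its built-in 128 MB default.
print(os.environ.get("MAX_CONTENT_LENGTH"))
```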
After changing .env, you need to restart the docker container.
Hi @wingjson, have you solved this issue? I met the same error on my end. Could you share your solution? Thanks!