[Bug]: OCR on a large file (> 128 MB) reports an error
Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
Branch name
any
Commit ID
any
Other environment information
No response
Actual behavior
Hi, I have set `export MAX_CONTENT_LENGTH=1024288000`, and the upload succeeded. But during the OCR process it reports `[ERROR] File size exceeds (<= 128Mb)` and loses a lot of content. I checked the source code; at this point `row["size"]` was 176,399,410 and `DOC_MAXIMUM_SIZE` was 1,024,288,000:

```python
def build(row):
    if row["size"] > DOC_MAXIMUM_SIZE:
        set_progress(row["id"], prog=-1,
                     msg="File size exceeds( <= %dMb )" % (int(DOC_MAXIMUM_SIZE / 1024 / 1024)))
        return []

    callback = partial(
        set_progress,
        row["id"],
        row["from_page"],
        row["to_page"])
    chunker = FACTORY[row["parser_id"].lower()]
```

Since 176,399,410 is smaller than 1,024,288,000, this size check did not trigger, yet the error still appeared. What could be the problem? Is there a separate limit in the OCR step?
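For context, a minimal sketch of how a limit like `DOC_MAXIMUM_SIZE` is commonly derived from the environment (the exact settings module and default value are assumptions, not confirmed from the source):

```python
import os

# Assumption: the worker falls back to a hard-coded 128 MB default
# when MAX_CONTENT_LENGTH is missing from *its* environment.
DOC_MAXIMUM_SIZE = int(os.environ.get("MAX_CONTENT_LENGTH", 128 * 1024 * 1024))
```

If the OCR task runs in a separate process or container that never saw the exported variable, it would still use the 128 MB default, which would match the reported error.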
Expected behavior
No response
Steps to reproduce
Upload a PDF file larger than 128 MB.
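For testing, one way to produce an oversized file (a hypothetical helper; any PDF over 128 MB reproduces it, and whether the parser tolerates padded files is an assumption):

```python
import shutil

# Hypothetical repro helper: pad a small PDF past the 128 MB limit.
# Lenient PDF readers ignore trailing bytes after %%EOF.
TARGET = 129 * 1024 * 1024  # just over 128 MB
shutil.copyfile("small.pdf", "big.pdf")  # hypothetical input file
with open("big.pdf", "ab") as f:
    f.write(b"%" * max(0, TARGET - f.tell()))
```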
Additional information
No response
Specify MAX_CONTENT_LENGTH in docker/.env if you use docker.
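For example (a sketch; the variable name follows this thread, but how compose injects it into the service is an assumption):

```
# docker/.env
MAX_CONTENT_LENGTH=1024288000
```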
I run from the source code; docker only starts the database. I added a print at the same `build()` check quoted above, and it never triggers. `MAX_CONTENT_LENGTH` is now 1,024,288,000.
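One quick check (a sketch, assuming the limit is read from the process environment as above): print the variable from inside the process that actually runs the OCR task, not the shell you exported it from:

```python
import os

# If this prints None inside the OCR worker, the exported
# MAX_CONTENT_LENGTH never reached that process and the code
# falls back to its built-in 128 MB default.
print(os.environ.get("MAX_CONTENT_LENGTH"))
```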
After changing .env, you need to restart the docker container.
Hi @wingjson, have you solved this issue? I met the same error on my end. Could you share your solution? Thanks!