[Bug]: Failed to parse document

Open auxpd opened this issue 10 months ago • 5 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

Branch name

main

Commit ID

dd7559a0096b63a9d3ffa72611d05b3538ca058a

Other environment information

No response

Actual behavior

unable to parse document. Progress remains at 0%. When checking the logs, the following error is displayed:

[WARNING] [2024-04-17 17:33:21,543] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x792f0c3f6e10>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=
[WARNING] [2024-04-17 17:33:24,745] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x792f0c3f78d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=
[WARNING] [2024-04-17 17:33:26,762] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x792f0c42c590>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=
[WARNING] [2024-04-17 17:33:27,163] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x792f0c47ce50>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=
[WARNING] [2024-04-17 17:33:27,964] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x793020249450>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=
[WARNING] [2024-04-17 17:33:29,566] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x792f0c40b0d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=
[WARNING] [2024-04-17 17:33:32,768] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x792f0c409b10>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=
[WARNING] [2024-04-17 17:33:34,784] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x792f0c294650>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=
[WARNING] [2024-04-17 17:33:35,186] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x792f0f2896d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=
[WARNING] [2024-04-17 17:33:35,987] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x792f0c47d390>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=
[WARNING] [2024-04-17 17:33:37,590] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x792f0c40aed0>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=
[WARNING] [2024-04-17 17:33:40,791] [connectionpool.urlopen] [line:874]: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x792f0c409990>: Failed to establish a new connection: [Errno 111] Connection refused')': /40b2b998fc7d11eebfbc0242ac190006?location=

Expected behavior

No response

Steps to reproduce

1. Upload the document.
2. Click on parse.

Additional information

No response

auxpd avatar Apr 17 '24 09:04 auxpd

These are WARNING logs from MinIO. It may have hit its throughput limit. You can ignore them.
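For anyone who wants to check whether the "[Errno 111] Connection refused" in these warnings means the storage service is not reachable at all, here is a minimal sketch of a TCP reachability test. The host `localhost` and port `9000` are assumptions (9000 is MinIO's default API port); the actual address depends on your docker setup, and `can_connect` is a hypothetical helper, not part of ragflow.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError (Errno 111 on Linux) and timeouts are
        # both subclasses of OSError, so they end up here.
        return False


if __name__ == "__main__":
    # Assumed endpoint: adjust host and port to match your deployment.
    print(can_connect("localhost", 9000))
```

If this prints `False` when run from the same container or host as the ragflow server, nothing is accepting connections at that address, which would be worth ruling out before treating the warnings as harmless.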

KevinHuSh avatar Apr 17 '24 11:04 KevinHuSh

[screenshot] It has been processing for 24 hours.

auxpd avatar Apr 18 '24 07:04 auxpd

I have the same problem. [WeChat screenshot 20240418162759] It can't parse the docx.

HanYu666666 avatar Apr 18 '24 08:04 HanYu666666

Same crazy problem: [screenshot] [screenshot]

yurochang avatar Apr 19 '24 08:04 yurochang

Same issue for me. I opened a new issue, as this one seems to be abandoned: https://github.com/infiniflow/ragflow/issues/547

homegrownhrbs avatar Apr 25 '24 20:04 homegrownhrbs

Upgrade to the latest dev version of the docker image. Problem solved.

KevinHuSh avatar May 16 '24 07:05 KevinHuSh