PDF import into knowledge base fails on the first attempt with PDF enhancement enabled; only the second run succeeds

Open goactiongo opened this issue 7 months ago • 2 comments

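Version: 4.9.7-fix2. Importing a PDF into the knowledge base with PDF enhancement enabled fails on the first attempt.

Image

Clicking "Previous" and then "Next" again makes the import succeed.

Image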

The pdf_maker log is below. The first run fails with AttributeError: 'NoneType' object has no attribute 'split'

Recognizing layout: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 10.05it/s]
Running OCR Error Detection: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 165.64it/s]
Detecting bboxes: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  7.23it/s]
Detecting bboxes: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  5.41it/s]
Recognizing Text:   0%|                                                                                                              | 0/1 [00:00<?, ?it/s]
INFO:     172.22.1.39:54548 - "POST /v2/parse/file HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/opt/conda/lib/python3.11/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/conda/lib/python3.11/site-packages/starlette/middleware/errors.py", line 187, in __call__
    raise exc
  File "/opt/conda/lib/python3.11/site-packages/starlette/middleware/errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "/opt/conda/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/opt/conda/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/opt/conda/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/opt/conda/lib/python3.11/site-packages/starlette/routing.py", line 714, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/conda/lib/python3.11/site-packages/starlette/routing.py", line 734, in app
    await route.handle(scope, receive, send)
  File "/opt/conda/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "/opt/conda/lib/python3.11/site-packages/starlette/routing.py", line 76, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/opt/conda/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/opt/conda/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/opt/conda/lib/python3.11/site-packages/starlette/routing.py", line 73, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/marker2_mp.py", line 181, in process_pdfs
    md_content_with_base64_images = embed_images_as_base64(results[0].get("text"), results[0].get("output_path"))
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/marker2_mp.py", line 144, in embed_images_as_base64
    lines = md_content.split('\n')
            ^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'split'

The second run succeeds:

Recognizing layout: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 10.44it/s]
Running OCR Error Detection: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 174.86it/s]
Detecting bboxes: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  7.16it/s]
Detecting bboxes: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  5.78it/s]
Recognizing Text: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.74it/s]
Recognizing tables: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.46it/s]
INFO:     172.22.1.39:54570 - "POST /v2/parse/file HTTP/1.1" 200 OK
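
In the traceback above, `results[0].get("text")` evidently returned `None`, so `md_content.split('\n')` raised the `AttributeError`. The root cause of the missing text on the first run is not known from this log. As a minimal sketch only, a guard like the one below would surface a clearer error at that point. `require_markdown` and `EmptyParseResultError` are hypothetical names, not part of marker2_mp.py, and the shape of the result dict is an assumption based on the two `.get(...)` calls in the traceback.

```python
from typing import Optional


class EmptyParseResultError(RuntimeError):
    """Raised when a parse result carries no Markdown text (hypothetical)."""


def require_markdown(result: dict) -> str:
    """Return result["text"], or raise a descriptive error if it is missing.

    `result` stands in for results[0] from the traceback; its real
    structure is assumed, not confirmed.
    """
    text: Optional[str] = result.get("text")
    if text is None:
        # List the keys that are present to make the failure easier to debug.
        raise EmptyParseResultError(
            "parse result has no 'text'; keys present: "
            + ", ".join(sorted(result))
        )
    return text


# Usage: a result without "text" now fails with an explicit message
# instead of an AttributeError deep inside the image-embedding step.
try:
    require_markdown({"output_path": "/tmp/out"})
except EmptyParseResultError as exc:
    print(exc)
```

This does not fix whatever leaves the first result empty; it only makes the failure easier to diagnose.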

goactiongo avatar May 13 '25 02:05 goactiongo

Please upload the document so we can take a look.

YYH211 avatar May 19 '25 03:05 YYH211

It has nothing to do with the document; every PDF file behaves this way.

goactiongo avatar May 19 '25 07:05 goactiongo