MinerU: GPU utilization is 0

Description of the bug

GPU memory is allocated, but GPU utilization stays at 0 while CPU usage is very high.
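One way to confirm this symptom from a script is to sample `nvidia-smi` while the job runs. The sketch below is only an illustration: the helper names, the 500 MiB threshold, and the sample values in the usage note are assumptions, not part of the original report. It also assumes every queried field comes back as a plain integer (some GPUs report `[N/A]`, which this parser does not handle).

```python
import subprocess

# nvidia-smi query that prints one "utilization, memory" CSV row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text):
    """Turn rows such as '0, 3021' into a list of dicts, one per GPU."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, mem = (field.strip() for field in line.split(","))
        stats.append({"utilization_pct": int(util), "memory_used_mib": int(mem)})
    return stats


def looks_idle_but_allocated(stat, min_memory_mib=500):
    """True when memory is held on the GPU but no kernels are running."""
    return stat["utilization_pct"] == 0 and stat["memory_used_mib"] >= min_memory_mib


def sample_gpus():
    """Run nvidia-smi once and return the parsed per-GPU stats."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_stats(out)
```

Calling `sample_gpus()` every few seconds during the slow stage and passing each entry to `looks_idle_but_allocated` would show whether the GPU is merely holding model weights while the CPU does the work. A single reading of 0% is not conclusive, since utilization is sampled over a short window.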

2024-08-01 06:16:51.464 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 5, cid_chars_radio: 0.0
2024-08-01 06:16:51.465 | WARNING | magic_pdf.filter.pdf_classify_by_type:classify:334 - pdf is not classified by area and text_len, by_image_area: False, by_text: False, by_avg_words: False, by_img_num: True, by_text_layout: False, by_img_narrow_strips: True, by_invalid_chars: True
INFO:datasets:PyTorch version 2.3.1 available.
2024-08-01 06:17:00.236 | INFO | magic_pdf.model.pdf_extract_kit:init:99 - DocAnalysis init, this may take some times. apply_layout: True, apply_formula: True, apply_ocr: True
2024-08-01 06:17:00.236 | INFO | magic_pdf.model.pdf_extract_kit:init:107 - using device: cuda
2024-08-01 06:17:00.236 | INFO | magic_pdf.model.pdf_extract_kit:init:109 - using models_dir: /root/dataDisk/mineru/wanderkid/PDF-Extract-Kit/models
CustomVisionEncoderDecoderModel init
CustomMBartForCausalLM init
CustomMBartDecoder init
[08/01 06:17:34 detectron2]: Rank of current process: 0. World size: 1
cuobjdump info : File '/root/.local/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so' does not contain device code
[08/01 06:17:35 detectron2]: Environment info:

sys.platform                     linux
Python                           3.10.14 (main, May 6 2024, 19:42:50) [GCC 11.2.0]
numpy                            1.26.4
detectron2                       0.6 @/root/.local/lib/python3.10/site-packages/detectron2
detectron2._C                    not built correctly: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /root/.local/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so)
Compiler ($CXX)                  c++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
CUDA compiler                    Build cuda_12.2.r12.2/compiler.33191640_0
detectron2 arch flags            /root/.local/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so
DETECTRON2_ENV_MODULE
PyTorch                          2.3.1+cu121 @/root/.local/lib/python3.10/site-packages/torch
PyTorch debug build              False
torch._C._GLIBCXX_USE_CXX11_ABI  False
GPU available                    Yes
GPU 0                            NVIDIA GeForce RTX 4090 (arch=8.9)
Driver version                   535.129.03
CUDA_HOME                        /usr/local/cuda
Pillow                           10.4.0
torchvision                      0.18.1+cu121 @/root/.local/lib/python3.10/site-packages/torchvision
torchvision arch flags           5.0, 6.0, 7.0, 7.5, 8.0, 8.6, 9.0
fvcore                           0.1.5.post20221221
iopath                           0.1.9
cv2                              4.6.0

How to reproduce the bug

import os
import json

from loguru import logger

from magic_pdf.pipe.UNIPipe import UNIPipe
from magic_pdf.rw.DiskReaderWriter import DiskReaderWriter

import magic_pdf.model as model_config
model_config.use_inside_model = True

try:
    current_script_dir = os.path.dirname(os.path.abspath(__file__))
    demo_name = "./files/new/xxx"
    pdf_path = os.path.join(current_script_dir, f"{demo_name}.pdf")
    pdf_bytes = open(pdf_path, "rb").read()
    # model_path = os.path.join(current_script_dir, f"{demo_name}.json")
    # model_json = json.loads(open(model_path, "r", encoding="utf-8").read())
    model_json = []  # pass an empty list to parse with the built-in model
    jso_useful_key = {"_pdf_type": "", "model_list": model_json}
    local_image_dir = os.path.join(current_script_dir, 'images')
    image_dir = str(os.path.basename(local_image_dir))
    image_writer = DiskReaderWriter(local_image_dir)
    pipe = UNIPipe(pdf_bytes, jso_useful_key, image_writer)
    pipe.pipe_classify()
    # If no valid model data was passed in, parse with the built-in model
    if len(model_json) == 0:
        if model_config.use_inside_model:
            pipe.pipe_analyze()
        else:
            logger.error("need model list input")
            exit(1)
    pipe.pipe_parse()
    md_content = pipe.pipe_mk_markdown(image_dir, drop_mode="none")
    with open(f"{demo_name}.md", "w", encoding="utf-8") as f:
        f.write(md_content)
except Exception as e:
    logger.exception(e)

[08/01 06:17:38 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /root/dataDisk/mineru/wanderkid/PDF-Extract-Kit/models/Layout/model_final.pth ...
[08/01 06:17:38 fvcore.common.checkpoint]: [Checkpointer] Loading from /root/dataDisk/mineru/wanderkid/PDF-Extract-Kit/models/Layout/model_final.pth ...
2024-08-01 06:17:40.010 | INFO | magic_pdf.model.pdf_extract_kit:init:132 - DocAnalysis init done!
2024-08-01 06:17:40.010 | INFO | magic_pdf.model.doc_analyze_by_custom_model:custom_model_init:92 - model init cost: 48.544782400131226
2024-08-01 06:17:42.132 | INFO | magic_pdf.model.pdf_extract_kit:call:143 - layout detection cost: 1.61

0: 1888x1344 (no detections), 170.4ms
Speed: 20.8ms preprocess, 170.4ms inference, 0.9ms postprocess per image at shape (1, 3, 1888, 1344)
2024-08-01 06:17:43.157 | INFO | magic_pdf.model.pdf_extract_kit:call:173 - formula nums: 0, mfr time: 0.0
2024-08-01 06:17:43.166 | INFO | magic_pdf.model.pdf_extract_kit:call:250 - ocr cost: 0.01
2024-08-01 06:17:43.559 | INFO | magic_pdf.model.pdf_extract_kit:call:143 - layout detection cost: 0.39

0: 1888x1344 (no detections), 25.4ms
Speed: 20.4ms preprocess, 25.4ms inference, 0.7ms postprocess per image at shape (1, 3, 1888, 1344)
2024-08-01 06:17:43.607 | INFO | magic_pdf.model.pdf_extract_kit:call:173 - formula nums: 0, mfr time: 0.0
2024-08-01 06:22:25.789 | INFO | magic_pdf.model.pdf_extract_kit:call:250 - ocr cost: 282.18
2024-08-01 06:22:26.237 | INFO | magic_pdf.model.pdf_extract_kit:call:143 - layout detection cost: 0.45

0: 1888x1344 (no detections), 26.1ms
Speed: 23.6ms preprocess, 26.1ms inference, 0.7ms postprocess per image at shape (1, 3, 1888, 1344)
2024-08-01 06:22:26.292 | INFO | magic_pdf.model.pdf_extract_kit:call:173 - formula nums: 0, mfr time: 0.0
2024-08-01 06:28:37.185 | INFO | magic_pdf.model.pdf_extract_kit:call:250 - ocr cost: 370.89
2024-08-01 06:28:37.761 | INFO | magic_pdf.model.pdf_extract_kit:call:143 - layout detection cost: 0.57

0: 1888x1344 (no detections), 25.4ms
Speed: 25.1ms preprocess, 25.4ms inference, 0.8ms postprocess per image at shape (1, 3, 1888, 1344)
2024-08-01 06:28:37.817 | INFO | magic_pdf.model.pdf_extract_kit:call:173 - formula nums: 0, mfr time: 0.0
2024-08-01 06:34:56.178 | INFO | magic_pdf.model.pdf_extract_kit:call:250 - ocr cost: 378.36
2024-08-01 06:34:56.578 | INFO | magic_pdf.model.pdf_extract_kit:call:143 - layout detection cost: 0.4

0: 1888x1344 (no detections), 27.4ms
Speed: 20.6ms preprocess, 27.4ms inference, 0.7ms postprocess per image at shape (1, 3, 1888, 1344)
2024-08-01 06:34:56.632 | INFO | magic_pdf.model.pdf_extract_kit:call:173 - formula nums: 0, mfr time: 0.0
2024-08-01 06:40:20.274 | INFO | magic_pdf.model.pdf_extract_kit:call:250 - ocr cost: 323.64
2024-08-01 06:40:20.275 | INFO | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:118 - doc analyze cost: 1359.75603723526

Operating system

Linux

Python version

3.10

Software version (magic-pdf --version)

0.6.x

Device mode

cuda

Aug 01 '24 06:08 thorory

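Because of compatibility issues, the current version uses CUDA acceleration only for layout detection and formula detection; OCR runs on the CPU. That is why the log shows layout and formula parsing finishing quickly while OCR is very slow. For the next version we have adjusted the compatibility configuration so that OCR can also be accelerated with CUDA, which should resolve your problem.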
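The split described above can be read directly off the log. As a rough sketch, the snippet below sums the per-stage costs that magic-pdf prints; the sample lines are copied from the log in this issue, while the function name and the regular expression are illustrative choices and not part of MinerU.

```python
import re
from collections import defaultdict

# Matches the three per-page timing messages that appear in the log.
STAGE_PATTERN = re.compile(
    r"(layout detection cost|mfr time|ocr cost): ([0-9]+(?:\.[0-9]+)?)"
)

# A few lines copied from the log in this issue.
SAMPLE_LOG = """\
2024-08-01 06:17:43.559 | INFO | magic_pdf.model.pdf_extract_kit:call:143 - layout detection cost: 0.39
2024-08-01 06:17:43.607 | INFO | magic_pdf.model.pdf_extract_kit:call:173 - formula nums: 0, mfr time: 0.0
2024-08-01 06:22:25.789 | INFO | magic_pdf.model.pdf_extract_kit:call:250 - ocr cost: 282.18
2024-08-01 06:22:26.237 | INFO | magic_pdf.model.pdf_extract_kit:call:143 - layout detection cost: 0.45
2024-08-01 06:28:37.185 | INFO | magic_pdf.model.pdf_extract_kit:call:250 - ocr cost: 370.89
"""


def total_stage_seconds(log_text):
    """Sum the reported seconds for each stage found in the log text."""
    totals = defaultdict(float)
    for stage, seconds in STAGE_PATTERN.findall(log_text):
        totals[stage] += float(seconds)
    return dict(totals)


if __name__ == "__main__":
    totals = total_stage_seconds(SAMPLE_LOG)
    # Print the slowest stage first.
    for stage, seconds in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{stage}: {seconds:.2f} s")
```

On the sample lines, OCR accounts for several hundred seconds while layout detection stays under one second, which matches a CPU-bound OCR stage.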

Aug 01 '24 06:08 myhloli

@thorory Follow this document to enable CUDA acceleration for OCR: https://github.com/opendatalab/MinerU/blob/master/docs/README_Ubuntu_CUDA_Acceleration_zh_CN.md
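After following the document, a quick sanity check can show whether the OCR backend is able to see the GPU. The sketch below assumes the OCR stage is backed by PaddlePaddle, which this thread does not state explicitly; `describe_paddle_device` is a hypothetical helper name, and the check says nothing about whether MinerU itself is configured to use the GPU.

```python
def describe_paddle_device():
    """Report whether the installed paddle build can use CUDA (assumed OCR backend)."""
    try:
        import paddle
    except ImportError:
        return "paddle is not installed"
    if not paddle.device.is_compiled_with_cuda():
        return "paddle is installed without CUDA support (CPU-only build)"
    # get_device() returns a string such as 'gpu:0' or 'cpu'.
    return f"paddle is CUDA-enabled; current device: {paddle.device.get_device()}"


if __name__ == "__main__":
    print(describe_paddle_device())
```

If this reports a CPU-only build, OCR would be expected to stay on the CPU regardless of the `cuda` device mode.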

Aug 02 '24 06:08 myhloli

Thanks, this issue is resolved.

Aug 02 '24 09:08 thorory