GPU memory is retained and not released when using the PP-StructureV3 pipeline
When parsing PDF documents with the PaddleX PP-StructureV3 pipeline, GPU memory usage keeps climbing and is never released, which eventually leads to an out-of-memory error. Calling paddle.device.cuda.empty_cache() to release the memory manually has no effect. Setting limit_type to max and limit_side_len to a reasonable value in the pipeline's YAML file prevents the memory from growing all the way to an overflow, but usage still keeps rising without being released. What is the cause of this, and how can the GPU memory be released correctly?
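For reference, the manual release attempt mentioned above is essentially the following (a minimal sketch; the full service code appears later in the thread):

```python
# Minimal sketch of the manual release attempt described above; in this case
# it did not bring GPU memory usage down.
import gc

import paddle

gc.collect()                      # drop unreachable Python objects first
paddle.device.cuda.empty_cache()  # then release Paddle's cached GPU blocks
```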
When you run the test, are you feeding the same file repeatedly, or different files?
> When you run the test, are you feeding the same file repeatedly, or different files?
Different files.
After limiting the maximum image size, did you observe that GPU memory stops rising once a certain number of files have been processed, or does it keep rising throughout the entire run?
I ran into the same problem. On an RTX 5090 D, using the PaddleX PP-StructureV3 pipeline to parse PDFs and images, GPU memory usage keeps rising. I sent 60 files to the server in batches; memory usage eventually reached the level shown below, and it was not released even after OCR finished.
```
|=========================================+========================+======================|
|  0  NVIDIA GeForce RTX 5090 D      Off  | 00000000:01:00.0  Off  |                  N/A |
| 31%  41C  P8    22W / 600W              | 28927MiB / 32607MiB    |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
```
> After limiting the maximum image size, did you observe that GPU memory stops rising once a certain number of files have been processed, or does it keep rising throughout the entire run?
It keeps rising throughout the entire run (after limiting the maximum image size, the growth is just slower).
Please share the configuration you are using and a minimal reproduction script; we will try to reproduce it on our side.
> Please share the configuration you are using and a minimal reproduction script; we will try to reproduce it on our side.
Reproduction script:

```python
from fastapi import FastAPI, File, UploadFile, HTTPException
from fastapi.responses import JSONResponse
import tempfile
import os

from paddlex import create_pipeline

# Initialize the FastAPI app
app = FastAPI()

max_file_size = 20 * 1024 * 1024
OCR_DEVICE = ''
os.environ['PADDLE_PDX_CACHE_HOME'] = r'./models'
structurev3_yaml_path = r"./PP-StructureV3.yaml"
pipeline_layout = create_pipeline(pipeline=structurev3_yaml_path)


@app.post("/parse-pdf/")
async def parse_pdf(file: UploadFile = File(...)):
    try:
        if file.size > max_file_size:
            raise HTTPException(status_code=400, detail="File size exceeds the limit of 20MB.")
        # Save the uploaded PDF to a temporary file
        with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as temp_pdf:
            temp_pdf.write(await file.read())
            temp_pdf_path = temp_pdf.name
        print(f"temp_pdf_path:{temp_pdf_path}")
        markdown_texts = await get_pdf_content(temp_pdf_path)
        # Delete the temporary file
        # os.unlink(temp_pdf_path)
        # Return the parsed result
        return JSONResponse(content={"status": "success", "data": markdown_texts})
    except Exception as e:
        # Catch exceptions and return the error message
        return JSONResponse(content={"status": "error", "message": str(e)}, status_code=500)


async def get_pdf_content(input_file: str):
    if not os.path.exists(input_file):
        raise Exception("file_path is empty...")
    try:
        output = pipeline_layout.predict(input=input_file)
        markdown_list = []
        markdown_images = []
        for res in output:
            md_info = res.markdown
            markdown_list.append(md_info)
            markdown_images.append(md_info.get("markdown_images", {}))
        markdown_texts = pipeline_layout.concatenate_markdown_pages(markdown_list)
        return markdown_texts
    except Exception as e:
        raise Exception(f"get_pdf_content error:{e}")
    finally:
        release_gpu_memory()


def release_gpu_memory():
    """Manually release GPU memory."""
    if 'gpu' in OCR_DEVICE:
        import gc
        import paddle
        # logger.info("gpu release memory")
        # cuda.empty_cache()
        gc.collect()
        paddle.device.cuda.empty_cache()
```
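The memory growth shows up when files are posted to /parse-pdf/ one after another. A minimal client sketch for driving such a test is below; the host, port, and PDF directory are placeholders, not part of the original report.

```python
# Minimal client sketch: post PDFs to the service one at a time so GPU memory
# growth per file can be observed. Adjust SERVER_URL and PDF_DIR as needed.
import glob
import time

import requests

SERVER_URL = "http://127.0.0.1:8000/parse-pdf/"  # assumed uvicorn default host/port
PDF_DIR = "./test_pdfs"                          # placeholder directory of test PDFs

for path in sorted(glob.glob(f"{PDF_DIR}/*.pdf")):
    with open(path, "rb") as f:
        # The multipart field name must match the `file` parameter of /parse-pdf/
        resp = requests.post(SERVER_URL, files={"file": (path, f, "application/pdf")})
    print(path, resp.status_code)
    time.sleep(1)  # small pause between files to make memory growth easier to observe
```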
YAML config:
```yaml
pipeline_name: PP-StructureV3
batch_size: 8

use_doc_preprocessor: True
use_seal_recognition: True
use_table_recognition: True
use_formula_recognition: True
use_chart_recognition: True
use_region_detection: True

SubModules:
  LayoutDetection:
    module_name: layout_detection
    model_name: PP-DocLayout_plus-L
    model_dir: null
    batch_size: 8
    threshold:
      0: 0.3   # paragraph_title
      1: 0.5   # image
      2: 0.4   # text
      3: 0.5   # number
      4: 0.5   # abstract
      5: 0.5   # content
      6: 0.5   # figure_table_chart_title
      7: 0.3   # formula
      8: 0.5   # table
      9: 0.5   # reference
      10: 0.5  # doc_title
      11: 0.5  # footnote
      12: 0.5  # header
      13: 0.5  # algorithm
      14: 0.5  # footer
      15: 0.45 # seal
      16: 0.5  # chart
      17: 0.5  # formula_number
      18: 0.5  # aside_text
      19: 0.5  # reference_content
    layout_nms: True
    layout_unclip_ratio: [1.02, 1.02]
    layout_merge_bboxes_mode:
      0: "large"  # paragraph_title
      1: "large"  # image
      2: "union"  # text
      3: "union"  # number
      4: "union"  # abstract
      5: "union"  # content
      6: "union"  # figure_table_chart_title
      7: "large"  # formula
      8: "union"  # table
      9: "union"  # reference
      10: "union" # doc_title
      11: "union" # footnote
      12: "union" # header
      13: "union" # algorithm
      14: "union" # footer
      15: "union" # seal
      16: "large" # chart
      17: "union" # formula_number
      18: "union" # aside_text
      19: "union" # reference_content
  ChartRecognition:
    module_name: chart_recognition
    model_name: PP-Chart2Table
    model_dir: null
    batch_size: 1
  RegionDetection:
    module_name: layout_detection
    model_name: PP-DocBlockLayout
    model_dir: null
    layout_nms: True
    layout_merge_bboxes_mode: "small"

SubPipelines:
  DocPreprocessor:
    pipeline_name: doc_preprocessor
    batch_size: 8
    use_doc_orientation_classify: True
    use_doc_unwarping: True
    SubModules:
      DocOrientationClassify:
        module_name: doc_text_orientation
        model_name: PP-LCNet_x1_0_doc_ori
        model_dir: null
        batch_size: 8
      DocUnwarping:
        module_name: image_unwarping
        model_name: UVDoc
        model_dir: null

  GeneralOCR:
    pipeline_name: OCR
    batch_size: 8
    text_type: general
    use_doc_preprocessor: False
    use_textline_orientation: True
    SubModules:
      TextDetection:
        module_name: text_detection
        model_name: PP-OCRv5_server_det
        model_dir: null
        limit_side_len: 736
        limit_type: max
        max_side_limit: 4000
        thresh: 0.3
        box_thresh: 0.6
        unclip_ratio: 1.5
      TextLineOrientation:
        module_name: textline_orientation
        model_name: PP-LCNet_x0_25_textline_ori
        model_dir: null
        batch_size: 8
      TextRecognition:
        module_name: text_recognition
        model_name: PP-OCRv5_server_rec
        model_dir: null
        batch_size: 8
        score_thresh: 0.0

  TableRecognition:
    pipeline_name: table_recognition_v2
    use_layout_detection: False
    use_doc_preprocessor: False
    use_ocr_model: False
    SubModules:
      TableClassification:
        module_name: table_classification
        model_name: PP-LCNet_x1_0_table_cls
        model_dir: null
      WiredTableStructureRecognition:
        module_name: table_structure_recognition
        model_name: SLANeXt_wired
        model_dir: null
      WirelessTableStructureRecognition:
        module_name: table_structure_recognition
        model_name: SLANet_plus
        model_dir: null
      WiredTableCellsDetection:
        module_name: table_cells_detection
        model_name: RT-DETR-L_wired_table_cell_det
        model_dir: null
      WirelessTableCellsDetection:
        module_name: table_cells_detection
        model_name: RT-DETR-L_wireless_table_cell_det
        model_dir: null
      TableOrientationClassify:
        module_name: doc_text_orientation
        model_name: PP-LCNet_x1_0_doc_ori
        model_dir: null
    SubPipelines:
      GeneralOCR:
        pipeline_name: OCR
        text_type: general
        use_doc_preprocessor: False
        use_textline_orientation: True
        SubModules:
          TextDetection:
            module_name: text_detection
            model_name: PP-OCRv5_server_det
            model_dir: null
            limit_side_len: 736
            limit_type: max
            max_side_limit: 4000
            thresh: 0.3
            box_thresh: 0.4
            unclip_ratio: 1.5
          TextLineOrientation:
            module_name: textline_orientation
            model_name: PP-LCNet_x0_25_textline_ori
            model_dir: null
            batch_size: 8
          TextRecognition:
            module_name: text_recognition
            model_name: PP-OCRv5_server_rec
            model_dir: null
            batch_size: 8
            score_thresh: 0.0

  SealRecognition:
    pipeline_name: seal_recognition
    batch_size: 8
    use_layout_detection: False
    use_doc_preprocessor: False
    SubPipelines:
      SealOCR:
        pipeline_name: OCR
        batch_size: 8
        text_type: seal
        use_doc_preprocessor: False
        use_textline_orientation: False
        SubModules:
          TextDetection:
            module_name: seal_text_detection
            model_name: PP-OCRv4_server_seal_det
            model_dir: null
            limit_side_len: 736
            limit_type: max
            max_side_limit: 4000
            thresh: 0.2
            box_thresh: 0.6
            unclip_ratio: 0.5
          TextRecognition:
            module_name: text_recognition
            model_name: PP-OCRv5_server_rec
            model_dir: null
            batch_size: 8
            score_thresh: 0

  FormulaRecognition:
    pipeline_name: formula_recognition
    batch_size: 8
    use_layout_detection: False
    use_doc_preprocessor: False
    SubModules:
      FormulaRecognition:
        module_name: formula_recognition
        model_name: PP-FormulaNet_plus-L
        model_dir: null
        batch_size: 8
```
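To help narrow this down, Paddle's built-in GPU memory statistics can distinguish memory held by live tensors from memory held by the caching allocator. The sketch below is only an illustration; the helper name and suggested call sites are mine, not part of the repro script.

```python
# Minimal sketch: log Paddle's GPU memory statistics after each file.
# `log_gpu_memory` is a hypothetical helper; call it e.g. at the end of
# get_pdf_content, before and after release_gpu_memory().
import paddle


def log_gpu_memory(tag: str, device_id: int = 0) -> None:
    mib = 1024 ** 2
    allocated = paddle.device.cuda.memory_allocated(device_id) / mib  # live tensors
    reserved = paddle.device.cuda.memory_reserved(device_id) / mib    # allocator cache
    print(f"[{tag}] allocated={allocated:.1f} MiB, reserved={reserved:.1f} MiB")


# Example usage inside the service:
#     log_gpu_memory("after predict")
#     release_gpu_memory()
#     log_gpu_memory("after empty_cache")
# If `allocated` keeps climbing across files, references to results are being
# retained; if only `reserved` climbs, the growth comes from the allocator cache.
```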
@zhang-prog could you take a look at this?
I'm hitting the same issue. Is there any solution?
Same question here. Has this been resolved? Even using del plus gc.collect() cannot release the occupied memory; it just keeps accumulating.