FatalError: `Segmentation fault` is detected by the operating system.

Open Jalen-Zhong opened this issue 1 year ago • 19 comments

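Description of the bug

I found similar problems in other issues, but none of their solutions worked for me. Running from the command line fails with the error below. Could someone please take a look? magic-pdf == 0.6.2b1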

How to reproduce the bug

1. Command: `magic-pdf pdf-command --pdf "testfile_1.pdf" --inside_model true`

2. Log: `2024-08-08 14:51:32.631 | WARNING | magic_pdf.cli.magicpdf:get_model_json:312 - not found json testfile_1.json existed
2024-08-08 14:51:32.631 | WARNING | magic_pdf.libs.config_reader:get_local_dir:64 - 'temp-output-dir' not found in magic-pdf.json, use '/tmp' as default
2024-08-08 14:51:32.798 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 1, cid_chars_radio: 0.0
2024-08-08 14:51:32.798 | WARNING | magic_pdf.filter.pdf_classify_by_type:classify:334 - pdf is not classified by area and text_len, by_image_area: False, by_text: False, by_avg_words: False, by_img_num: True, by_text_layout: False, by_img_narrow_strips: True, by_invalid_chars: True
INFO:datasets:PyTorch version 2.3.1 available.
2024-08-08 14:51:40.728 | INFO | magic_pdf.model.pdf_extract_kit:init:99 - DocAnalysis init, this may take some times. apply_layout: True, apply_formula: True, apply_ocr: True
2024-08-08 14:51:40.728 | INFO | magic_pdf.model.pdf_extract_kit:init:107 - using device: cuda
2024-08-08 14:51:40.729 | INFO | magic_pdf.model.pdf_extract_kit:init:109 - using models_dir: /root/.cache/modelscope/hub/wanderkid/PDF-Extract-Kit/models
CustomVisionEncoderDecoderModel init
CustomMBartForCausalLM init
CustomMBartDecoder init
[08/08 14:51:54 detectron2]: Rank of current process: 0. World size: 1
cuobjdump info : File '/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so' does not contain device code
[08/08 14:51:54 detectron2]: Environment info:


sys.platform                     linux
Python                           3.10.14 (main, May 6 2024, 19:42:50) [GCC 11.2.0]
numpy                            1.26.4
detectron2                       0.6 @/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/detectron2
detectron2._C                    not built correctly: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /root/anaconda3/envs/MinerU/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so)
Compiler ($CXX)                  c++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CUDA compiler                    Build cuda_11.8.r11.8/compiler.31833905_0
detectron2 arch flags            /root/anaconda3/envs/MinerU/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so
DETECTRON2_ENV_MODULE
PyTorch                          2.3.1+cu121 @/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/torch
PyTorch debug build              False
torch._C._GLIBCXX_USE_CXX11_ABI  False
GPU available                    Yes
GPU 0,1,2,3                      NVIDIA GeForce RTX 3090 (arch=8.6)
CUDA_HOME                        /usr/local/cuda
Pillow                           10.4.0
torchvision                      0.18.1+cu121 @/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/torchvision
torchvision arch flags           5.0, 6.0, 7.0, 7.5, 8.0, 8.6, 9.0
fvcore                           0.1.5.post20221221
iopath                           0.1.9
cv2                              4.6.0


PyTorch built with:

  • GCC 9.3
  • C++ Version: 201703
  • Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v3.3.6 (Git Hash 86e6af5974177e513fd3fee58425e1063e7f1361)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX512
  • CUDA Runtime 12.1
  • NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  • CuDNN 8.6 (built against CUDA 11.8)
    • Built with CuDNN 8.9.2
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.3.1, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,

[08/08 14:51:54 detectron2]: Command line arguments: {'config_file': '/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml', 'resume': False, 'eval_only': False, 'num_gpus': 1, 'num_machines': 1, 'machine_rank': 0, 'dist_url': 'tcp://127.0.0.1:57823', 'opts': ['MODEL.WEIGHTS', '/root/.cache/modelscope/hub/wanderkid/PDF-Extract-Kit/models/Layout/model_final.pth']} [08/08 14:51:54 detectron2]: Contents of args.config_file=/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml: AUG: DETR: true CACHE_DIR: ~/cache/huggingface CUDNN_BENCHMARK: false DATALOADER: ASPECT_RATIO_GROUPING: true FILTER_EMPTY_ANNOTATIONS: false NUM_WORKERS: 4 REPEAT_THRESHOLD: 0.0 SAMPLER_TRAIN: TrainingSampler DATASETS: PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 PROPOSAL_FILES_TEST: [] PROPOSAL_FILES_TRAIN: [] TEST:

  • scihub_train TRAIN:
  • scihub_train GLOBAL: HACK: 1.0 ICDAR_DATA_DIR_TEST: '' ICDAR_DATA_DIR_TRAIN: '' INPUT: CROP: ENABLED: true SIZE:
    • 384
    • 600 TYPE: absolute_range FORMAT: RGB MASK_FORMAT: polygon MAX_SIZE_TEST: 1333 MAX_SIZE_TRAIN: 1333 MIN_SIZE_TEST: 800 MIN_SIZE_TRAIN:
  • 480
  • 512
  • 544
  • 576
  • 608
  • 640
  • 672
  • 704
  • 736
  • 768
  • 800 MIN_SIZE_TRAIN_SAMPLING: choice RANDOM_FLIP: horizontal MODEL: ANCHOR_GENERATOR: ANGLES:
      • -90
      • 0
      • 90 ASPECT_RATIOS:
      • 0.5
      • 1.0
      • 2.0 NAME: DefaultAnchorGenerator OFFSET: 0.0 SIZES:
      • 32
      • 64
      • 128
      • 256
      • 512 BACKBONE: FREEZE_AT: 2 NAME: build_vit_fpn_backbone CONFIG_PATH: '' DEVICE: cuda FPN: FUSE_TYPE: sum IN_FEATURES:
    • layer3
    • layer5
    • layer7
    • layer11 NORM: '' OUT_CHANNELS: 256 IMAGE_ONLY: true KEYPOINT_ON: false LOAD_PROPOSALS: false MASK_ON: true META_ARCHITECTURE: VLGeneralizedRCNN PANOPTIC_FPN: COMBINE: ENABLED: true INSTANCES_CONFIDENCE_THRESH: 0.5 OVERLAP_THRESH: 0.5 STUFF_AREA_LIMIT: 4096 INSTANCE_LOSS_WEIGHT: 1.0 PIXEL_MEAN:
  • 127.5
  • 127.5
  • 127.5 PIXEL_STD:
  • 127.5
  • 127.5
  • 127.5 PROPOSAL_GENERATOR: MIN_SIZE: 0 NAME: RPN RESNETS: DEFORM_MODULATED: false DEFORM_NUM_GROUPS: 1 DEFORM_ON_PER_STAGE:
    • false
    • false
    • false
    • false DEPTH: 50 NORM: FrozenBN NUM_GROUPS: 1 OUT_FEATURES:
    • res4 RES2_OUT_CHANNELS: 256 RES5_DILATION: 1 STEM_OUT_CHANNELS: 64 STRIDE_IN_1X1: true WIDTH_PER_GROUP: 64 RETINANET: BBOX_REG_LOSS_TYPE: smooth_l1 BBOX_REG_WEIGHTS:
    • 1.0
    • 1.0
    • 1.0
    • 1.0 FOCAL_LOSS_ALPHA: 0.25 FOCAL_LOSS_GAMMA: 2.0 IN_FEATURES:
    • p3
    • p4
    • p5
    • p6
    • p7 IOU_LABELS:
    • 0
    • -1
    • 1 IOU_THRESHOLDS:
    • 0.4
    • 0.5 NMS_THRESH_TEST: 0.5 NORM: '' NUM_CLASSES: 10 NUM_CONVS: 4 PRIOR_PROB: 0.01 SCORE_THRESH_TEST: 0.05 SMOOTH_L1_LOSS_BETA: 0.1 TOPK_CANDIDATES_TEST: 1000 ROI_BOX_CASCADE_HEAD: BBOX_REG_WEIGHTS:
      • 10.0
      • 10.0
      • 5.0
      • 5.0
      • 20.0
      • 20.0
      • 10.0
      • 10.0
      • 30.0
      • 30.0
      • 15.0
      • 15.0 IOUS:
    • 0.5
    • 0.6
    • 0.7 ROI_BOX_HEAD: BBOX_REG_LOSS_TYPE: smooth_l1 BBOX_REG_LOSS_WEIGHT: 1.0 BBOX_REG_WEIGHTS:
    • 10.0
    • 10.0
    • 5.0
    • 5.0 CLS_AGNOSTIC_BBOX_REG: true CONV_DIM: 256 FC_DIM: 1024 NAME: FastRCNNConvFCHead NORM: '' NUM_CONV: 0 NUM_FC: 2 POOLER_RESOLUTION: 7 POOLER_SAMPLING_RATIO: 0 POOLER_TYPE: ROIAlignV2 SMOOTH_L1_BETA: 0.0 TRAIN_ON_PRED_BOXES: false ROI_HEADS: BATCH_SIZE_PER_IMAGE: 512 IN_FEATURES:
    • p2
    • p3
    • p4
    • p5 IOU_LABELS:
    • 0
    • 1 IOU_THRESHOLDS:
    • 0.5 NAME: CascadeROIHeads NMS_THRESH_TEST: 0.5 NUM_CLASSES: 10 POSITIVE_FRACTION: 0.25 PROPOSAL_APPEND_GT: true SCORE_THRESH_TEST: 0.05 ROI_KEYPOINT_HEAD: CONV_DIMS:
    • 512
    • 512
    • 512
    • 512
    • 512
    • 512
    • 512
    • 512 LOSS_WEIGHT: 1.0 MIN_KEYPOINTS_PER_IMAGE: 1 NAME: KRCNNConvDeconvUpsampleHead NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true NUM_KEYPOINTS: 17 POOLER_RESOLUTION: 14 POOLER_SAMPLING_RATIO: 0 POOLER_TYPE: ROIAlignV2 ROI_MASK_HEAD: CLS_AGNOSTIC_MASK: false CONV_DIM: 256 NAME: MaskRCNNConvUpsampleHead NORM: '' NUM_CONV: 4 POOLER_RESOLUTION: 14 POOLER_SAMPLING_RATIO: 0 POOLER_TYPE: ROIAlignV2 RPN: BATCH_SIZE_PER_IMAGE: 256 BBOX_REG_LOSS_TYPE: smooth_l1 BBOX_REG_LOSS_WEIGHT: 1.0 BBOX_REG_WEIGHTS:
    • 1.0
    • 1.0
    • 1.0
    • 1.0 BOUNDARY_THRESH: -1 CONV_DIMS:
    • -1 HEAD_NAME: StandardRPNHead IN_FEATURES:
    • p2
    • p3
    • p4
    • p5
    • p6 IOU_LABELS:
    • 0
    • -1
    • 1 IOU_THRESHOLDS:
    • 0.3
    • 0.7 LOSS_WEIGHT: 1.0 NMS_THRESH: 0.7 POSITIVE_FRACTION: 0.5 POST_NMS_TOPK_TEST: 1000 POST_NMS_TOPK_TRAIN: 2000 PRE_NMS_TOPK_TEST: 1000 PRE_NMS_TOPK_TRAIN: 2000 SMOOTH_L1_BETA: 0.0 SEM_SEG_HEAD: COMMON_STRIDE: 4 CONVS_DIM: 128 IGNORE_VALUE: 255 IN_FEATURES:
    • p2
    • p3
    • p4
    • p5 LOSS_WEIGHT: 1.0 NAME: SemSegFPNHead NORM: GN NUM_CLASSES: 10 VIT: DROP_PATH: 0.1 IMG_SIZE:
    • 224
    • 224 NAME: layoutlmv3_base OUT_FEATURES:
    • layer3
    • layer5
    • layer7
    • layer11 POS_TYPE: abs WEIGHTS: OUTPUT_DIR: SCIHUB_DATA_DIR_TRAIN: ~/publaynet/layout_scihub/train SEED: 42 SOLVER: AMP: ENABLED: true BACKBONE_MULTIPLIER: 1.0 BASE_LR: 0.0002 BIAS_LR_FACTOR: 1.0 CHECKPOINT_PERIOD: 2000 CLIP_GRADIENTS: CLIP_TYPE: full_model CLIP_VALUE: 1.0 ENABLED: true NORM_TYPE: 2.0 GAMMA: 0.1 GRADIENT_ACCUMULATION_STEPS: 1 IMS_PER_BATCH: 32 LR_SCHEDULER_NAME: WarmupCosineLR MAX_ITER: 20000 MOMENTUM: 0.9 NESTEROV: false OPTIMIZER: ADAMW REFERENCE_WORLD_SIZE: 0 STEPS:
  • 10000 WARMUP_FACTOR: 0.01 WARMUP_ITERS: 333 WARMUP_METHOD: linear WEIGHT_DECAY: 0.05 WEIGHT_DECAY_BIAS: null WEIGHT_DECAY_NORM: 0.0 TEST: AUG: ENABLED: false FLIP: true MAX_SIZE: 4000 MIN_SIZES:
    • 400
    • 500
    • 600
    • 700
    • 800
    • 900
    • 1000
    • 1100
    • 1200 DETECTIONS_PER_IMAGE: 100 EVAL_PERIOD: 1000 EXPECTED_RESULTS: [] KEYPOINT_OKS_SIGMAS: [] PRECISE_BN: ENABLED: false NUM_ITER: 200 VERSION: 2 VIS_PERIOD: 0

[08/08 14:51:56 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /root/.cache/modelscope/hub/wanderkid/PDF-Extract-Kit/models/Layout/model_final.pth ...
[08/08 14:51:56 fvcore.common.checkpoint]: [Checkpointer] Loading from /root/.cache/modelscope/hub/wanderkid/PDF-Extract-Kit/models/Layout/model_final.pth ...
2024-08-08 14:51:57.268 | INFO | magic_pdf.model.pdf_extract_kit:init:132 - DocAnalysis init done!
2024-08-08 14:51:57.268 | INFO | magic_pdf.model.doc_analyze_by_custom_model:custom_model_init:92 - model init cost: 24.469878435134888`

3. Traceback: `--------------------------------------
C++ Traceback (most recent call last):

0 at::_ops::conv2d::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt)
1 at::native::conv2d_symint(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt)
2 at::_ops::convolution::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, bool, c10::ArrayRef<c10::SymInt>, c10::SymInt)
3 at::_ops::convolution::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, bool, c10::ArrayRef<c10::SymInt>, c10::SymInt)
4 at::native::convolution(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, long)
5 at::_ops::_convolution::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, bool, c10::ArrayRef<c10::SymInt>, c10::SymInt, bool, bool, bool, bool)
6 at::native::_convolution(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, long, bool, bool, bool, bool)
7 at::_ops::cudnn_convolution::call(at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt, bool, bool, bool)
8 at::native::cudnn_convolution(at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long, bool, bool, bool)


Error Message Summary:

FatalError: Segmentation fault is detected by the operating system. [TimeInfo: *** Aborted at 1723099917 (unix time) try "date -d @1723099917" if you are using GNU date ***] [SignalInfo: *** SIGSEGV (@0x59) received by PID 37094 (TID 0x7f617e0133c0) from PID 89 ***]

Segmentation fault`

Operating system

Linux

Python version

3.10

Software version (magic-pdf --version)

0.6.x

Device mode

cuda

Jalen-Zhong avatar Aug 08 '24 06:08 Jalen-Zhong
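
The `TimeInfo` line in the crash summary suggests `date -d @1723099917` for decoding the abort time. A rough Python equivalent is sketched below, using only the standard library; the helper name `abort_time` is made up for illustration.

```python
from datetime import datetime, timezone


def abort_time(unix_seconds: int) -> datetime:
    """Convert the Unix timestamp from a TimeInfo line to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


if __name__ == "__main__":
    # Timestamp copied from the SIGSEGV report in this issue.
    print(abort_time(1723099917).isoformat())
```

If the server clock is UTC+8 (an assumption), the result corresponds to 14:51:57 local time, the same second as the last `model init cost` log line, which would place the crash right after model initialization.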

Have you tried creating a fresh conda environment by following this tutorial? https://github.com/opendatalab/MinerU/blob/master/docs/README_Ubuntu_CUDA_Acceleration_zh_CN.md

myhloli avatar Aug 08 '24 07:08 myhloli

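My CUDA and driver versions are as follows: image. I did not reinstall the driver; from step 4 onward I set up the virtual environment exactly as described.

Question: will the current mismatch between the CUDA version and the driver version affect how this project runs? This is a shared server, so I cannot change the driver version.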

Jalen-Zhong avatar Aug 08 '24 07:08 Jalen-Zhong

A driver version mismatch is not a big problem. Did this error appear at step 9 or at step 10 of the tutorial?

myhloli avatar Aug 08 '24 08:08 myhloli

magic-pdf pdf-command --pdf "testfile_1.pdf" --inside_model true

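I just tested it: with CUDA acceleration enabled it fails, but on CPU there is no problem. The error occurs at step 9, and at step 10 as well. In addition, the results differ from those of the demo you provide (https://opendatalab.com/OpenSourceTools/Extractor/PDF): my local results are very poor, while the demo results are very good. What is the reason?

Here are the test results. Local: image

Demo: image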

Jalen-Zhong avatar Aug 08 '24 08:08 Jalen-Zhong

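Please upload a sample PDF here so we can debug it. If the problem already appears at step 9 of the tutorial, the system is incompatible, and you may need to try an Ubuntu 22.04 Docker container.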

myhloli avatar Aug 08 '24 08:08 myhloli
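
The environment dump in the original report contains `detectron2._C not built correctly` together with a `GLIBC_2.32 not found` message, which may be related to the incompatibility mentioned above. Below is a minimal sketch for checking the local glibc against that requirement using only the standard library. The function names are hypothetical, and whether this mismatch actually causes the segmentation fault is not confirmed in this thread.

```python
import platform


def parse_version(text):
    """Turn a dotted version such as '2.32' into a comparable tuple (2, 32)."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(required, found=None):
    """Return True if the glibc version `found` is at least `required`.

    When `found` is omitted, the version is read from the running interpreter
    via platform.libc_ver(); on non-glibc systems the check returns False.
    """
    if found is None:
        libc_name, found = platform.libc_ver()
        if libc_name != "glibc" or not found:
            return False
    return parse_version(found) >= parse_version(required)


if __name__ == "__main__":
    print("local libc:", platform.libc_ver())
    print("satisfies GLIBC_2.32:", glibc_satisfies("2.32"))
```

For reference, Ubuntu 18.04 ships glibc 2.27 and Ubuntu 20.04 ships 2.31, while Ubuntu 22.04 ships 2.35, so of these three only 22.04 would pass the check.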

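Here are several of my test cases. They are all scanned documents containing plain text, simple tables, complex tables, images, and other elements. The files also all have headers and watermarks, which makes recognition fairly difficult.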

testfile_1.pdf testfile_2.pdf testfile_3.pdf testfile_4.pdf

Jalen-Zhong avatar Aug 08 '24 08:08 Jalen-Zhong

I followed https://github.com/opendatalab/MinerU/blob/master/docs/README_Ubuntu_CUDA_Acceleration_zh_CN.md, starting from step 4, and got the error below. I don't know why. `

Error Message Summary:

FatalError: Segmentation fault is detected by the operating system. [TimeInfo: *** Aborted at 1723107561 (unix time) try "date -d @1723107561" if you are using GNU date ***] [SignalInfo: *** SIGSEGV (@0x20000002ef4) received by PID 1712537 (TID 0x7f131eb284c0) from PID 12020 ***]

Segmentation fault (core dumped) `

redpintings avatar Aug 08 '24 09:08 redpintings

I went through the tutorial step by step from step 4, and this error appears when I run the demo at step 8. It happens with both cpu and cuda.

redpintings avatar Aug 08 '24 09:08 redpintings

Please upload the complete error stack trace.

myhloli avatar Aug 08 '24 09:08 myhloli

`(min) bigdata@gpu2 Miner $ magic-pdf pdf-command --pdf small_ocr.pdf
2024-08-08 17:08:21.760 | WARNING | magic_pdf.cli.magicpdf:get_model_json:312 - not found json small_ocr.json existed
2024-08-08 17:08:21.761 | WARNING | magic_pdf.libs.config_reader:get_local_dir:64 - 'temp-output-dir' not found in magic-pdf.json, use '/tmp' as default
2024-08-08 17:08:23.027 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 8, cid_chars_radio: 0.0
2024-08-08 17:08:23.031 | WARNING | magic_pdf.filter.pdf_classify_by_type:classify:334 - pdf is not classified by area and text_len, by_image_area: False, by_text: False, by_avg_words: False, by_img_num: True, by_text_layout: False, by_img_narrow_strips: False, by_invalid_chars: True
INFO:datasets:PyTorch version 2.3.1+cu118 available.
2024-08-08 17:08:30.863 | INFO | magic_pdf.model.pdf_extract_kit:init:99 - DocAnalysis init, this may take some times. apply_layout: True, apply_formula: True, apply_ocr: True
2024-08-08 17:08:30.863 | INFO | magic_pdf.model.pdf_extract_kit:init:107 - using device: cpu
2024-08-08 17:08:30.863 | INFO | magic_pdf.model.pdf_extract_kit:init:109 - using models_dir: /home/bigdata/projects/ysl/paint/Miner/PDF-Extract-Kit/models
CustomVisionEncoderDecoderModel init
CustomMBartForCausalLM init
CustomMBartDecoder init
[08/08 17:08:49 detectron2]: Rank of current process: 0. World size: 1
/bin/sh: 1: /usr/local/cuda-12.2:/bin/nvcc: not found
[08/08 17:08:52 detectron2]: Environment info:


sys.platform                     linux
Python                           3.10.14 (main, May 6 2024, 19:42:50) [GCC 11.2.0]
numpy                            1.26.3
detectron2                       0.6 @/home/bigdata/.conda/envs/min/lib/python3.10/site-packages/detectron2
detectron2._C                    not built correctly: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /home/bigdata/.conda/envs/min/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so)
Compiler ($CXX)                  c++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
CUDA compiler                    Not found
detectron2 arch flags            /home/bigdata/.conda/envs/min/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so; cannot find cuobjdump
DETECTRON2_ENV_MODULE
PyTorch                          2.3.1+cu118 @/home/bigdata/.conda/envs/min/lib/python3.10/site-packages/torch
PyTorch debug build              False
torch._C._GLIBCXX_USE_CXX11_ABI  False
GPU available                    Yes
GPU 0,1,2,3,4,5,6,7,8,9          NVIDIA A100-SXM4-40GB (arch=8.0)
Driver version                   535.104.05
CUDA_HOME                        /usr/local/cuda-12.2: - invalid!
Pillow                           10.2.0
torchvision                      0.18.1+cu118 @/home/bigdata/.conda/envs/min/lib/python3.10/site-packages/torchvision
torchvision arch flags           /home/bigdata/.conda/envs/min/lib/python3.10/site-packages/torchvision/_C.so; cannot find cuobjdump
fvcore                           0.1.5.post20221221
iopath                           0.1.9
cv2                              4.6.0


PyTorch built with:

  • GCC 9.3
  • C++ Version: 201703
  • Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v3.3.6 (Git Hash 86e6af5974177e513fd3fee58425e1063e7f1361)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX512
  • CUDA Runtime 11.8
  • NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_90,code=sm_90
  • CuDNN 8.9.7 (built against CUDA 12.2)
    • Built with CuDNN 8.7
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.3.1, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,

[08/08 17:08:52 detectron2]: Command line arguments: {'config_file': '/home/bigdata/.conda/envs/min/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml', 'resume': False, 'eval_only': False, 'num_gpus': 1, 'num_machines': 1, 'machine_rank': 0, 'dist_url': 'tcp://127.0.0.1:57823', 'opts': ['MODEL.WEIGHTS', '/home/bigdata/projects/ysl/paint/Miner/PDF-Extract-Kit/models/Layout/model_final.pth']} [08/08 17:08:52 detectron2]: Contents of args.config_file=/home/bigdata/.conda/envs/min/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml: AUG: DETR: true CACHE_DIR: ~/cache/huggingface CUDNN_BENCHMARK: false DATALOADER: ASPECT_RATIO_GROUPING: true FILTER_EMPTY_ANNOTATIONS: false NUM_WORKERS: 4 REPEAT_THRESHOLD: 0.0 SAMPLER_TRAIN: TrainingSampler DATASETS: PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 PROPOSAL_FILES_TEST: [] PROPOSAL_FILES_TRAIN: [] TEST:

  • scihub_train TRAIN:
  • scihub_train GLOBAL: HACK: 1.0 ICDAR_DATA_DIR_TEST: '' ICDAR_DATA_DIR_TRAIN: '' INPUT: CROP: ENABLED: true SIZE:
    • 384
    • 600 TYPE: absolute_range FORMAT: RGB MASK_FORMAT: polygon MAX_SIZE_TEST: 1333 MAX_SIZE_TRAIN: 1333 MIN_SIZE_TEST: 800 MIN_SIZE_TRAIN:
  • 480
  • 512
  • 544
  • 576
  • 608
  • 640
  • 672
  • 704
  • 736
  • 768
  • 800 MIN_SIZE_TRAIN_SAMPLING: choice RANDOM_FLIP: horizontal MODEL: ANCHOR_GENERATOR: ANGLES:
      • -90
      • 0
      • 90 ASPECT_RATIOS:
      • 0.5
      • 1.0
      • 2.0 NAME: DefaultAnchorGenerator OFFSET: 0.0 SIZES:
      • 32
      • 64
      • 128
      • 256
      • 512 BACKBONE: FREEZE_AT: 2 NAME: build_vit_fpn_backbone CONFIG_PATH: '' DEVICE: cuda FPN: FUSE_TYPE: sum IN_FEATURES:
    • layer3
    • layer5
    • layer7
    • layer11 NORM: '' OUT_CHANNELS: 256 IMAGE_ONLY: true KEYPOINT_ON: false LOAD_PROPOSALS: false MASK_ON: true META_ARCHITECTURE: VLGeneralizedRCNN PANOPTIC_FPN: COMBINE: ENABLED: true INSTANCES_CONFIDENCE_THRESH: 0.5 OVERLAP_THRESH: 0.5 STUFF_AREA_LIMIT: 4096 INSTANCE_LOSS_WEIGHT: 1.0 PIXEL_MEAN:
  • 127.5
  • 127.5
  • 127.5 PIXEL_STD:
  • 127.5
  • 127.5
  • 127.5 PROPOSAL_GENERATOR: MIN_SIZE: 0 NAME: RPN RESNETS: DEFORM_MODULATED: false DEFORM_NUM_GROUPS: 1 DEFORM_ON_PER_STAGE:
    • false
    • false
    • false
    • false DEPTH: 50 NORM: FrozenBN NUM_GROUPS: 1 OUT_FEATURES:
    • res4 RES2_OUT_CHANNELS: 256 RES5_DILATION: 1 STEM_OUT_CHANNELS: 64 STRIDE_IN_1X1: true WIDTH_PER_GROUP: 64 RETINANET: BBOX_REG_LOSS_TYPE: smooth_l1 BBOX_REG_WEIGHTS:
    • 1.0
    • 1.0
    • 1.0
    • 1.0 FOCAL_LOSS_ALPHA: 0.25 FOCAL_LOSS_GAMMA: 2.0 IN_FEATURES:
    • p3
    • p4
    • p5
    • p6
    • p7 IOU_LABELS:
    • 0
    • -1
    • 1 IOU_THRESHOLDS:
    • 0.4
    • 0.5 NMS_THRESH_TEST: 0.5 NORM: '' NUM_CLASSES: 10 NUM_CONVS: 4 PRIOR_PROB: 0.01 SCORE_THRESH_TEST: 0.05 SMOOTH_L1_LOSS_BETA: 0.1 TOPK_CANDIDATES_TEST: 1000 ROI_BOX_CASCADE_HEAD: BBOX_REG_WEIGHTS:
      • 10.0
      • 10.0
      • 5.0
      • 5.0
      • 20.0
      • 20.0
      • 10.0
      • 10.0
      • 30.0
      • 30.0
      • 15.0
      • 15.0 IOUS:
    • 0.5
    • 0.6
    • 0.7 ROI_BOX_HEAD: BBOX_REG_LOSS_TYPE: smooth_l1 BBOX_REG_LOSS_WEIGHT: 1.0 BBOX_REG_WEIGHTS:
    • 10.0
    • 10.0
    • 5.0
    • 5.0 CLS_AGNOSTIC_BBOX_REG: true CONV_DIM: 256 FC_DIM: 1024 NAME: FastRCNNConvFCHead NORM: '' NUM_CONV: 0 NUM_FC: 2 POOLER_RESOLUTION: 7 POOLER_SAMPLING_RATIO: 0 POOLER_TYPE: ROIAlignV2 SMOOTH_L1_BETA: 0.0 TRAIN_ON_PRED_BOXES: false ROI_HEADS: BATCH_SIZE_PER_IMAGE: 512 IN_FEATURES:
    • p2
    • p3
    • p4
    • p5 IOU_LABELS:
    • 0
    • 1 IOU_THRESHOLDS:
    • 0.5 NAME: CascadeROIHeads NMS_THRESH_TEST: 0.5 NUM_CLASSES: 10 POSITIVE_FRACTION: 0.25 PROPOSAL_APPEND_GT: true SCORE_THRESH_TEST: 0.05 ROI_KEYPOINT_HEAD: CONV_DIMS:
    • 512
    • 512
    • 512
    • 512
    • 512
    • 512
    • 512
    • 512 LOSS_WEIGHT: 1.0 MIN_KEYPOINTS_PER_IMAGE: 1 NAME: KRCNNConvDeconvUpsampleHead NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true NUM_KEYPOINTS: 17 POOLER_RESOLUTION: 14 POOLER_SAMPLING_RATIO: 0 POOLER_TYPE: ROIAlignV2 ROI_MASK_HEAD: CLS_AGNOSTIC_MASK: false CONV_DIM: 256 NAME: MaskRCNNConvUpsampleHead NORM: '' NUM_CONV: 4 POOLER_RESOLUTION: 14 POOLER_SAMPLING_RATIO: 0 POOLER_TYPE: ROIAlignV2 RPN: BATCH_SIZE_PER_IMAGE: 256 BBOX_REG_LOSS_TYPE: smooth_l1 BBOX_REG_LOSS_WEIGHT: 1.0 BBOX_REG_WEIGHTS:
    • 1.0
    • 1.0
    • 1.0
    • 1.0 BOUNDARY_THRESH: -1 CONV_DIMS:
    • -1 HEAD_NAME: StandardRPNHead IN_FEATURES:
    • p2
    • p3
    • p4
    • p5
    • p6 IOU_LABELS:
    • 0
    • -1
    • 1 IOU_THRESHOLDS:
    • 0.3
    • 0.7 LOSS_WEIGHT: 1.0 NMS_THRESH: 0.7 POSITIVE_FRACTION: 0.5 POST_NMS_TOPK_TEST: 1000 POST_NMS_TOPK_TRAIN: 2000 PRE_NMS_TOPK_TEST: 1000 PRE_NMS_TOPK_TRAIN: 2000 SMOOTH_L1_BETA: 0.0 SEM_SEG_HEAD: COMMON_STRIDE: 4 CONVS_DIM: 128 IGNORE_VALUE: 255 IN_FEATURES:
    • p2
    • p3
    • p4
    • p5 LOSS_WEIGHT: 1.0 NAME: SemSegFPNHead NORM: GN NUM_CLASSES: 10 VIT: DROP_PATH: 0.1 IMG_SIZE:
    • 224
    • 224 NAME: layoutlmv3_base OUT_FEATURES:
    • layer3
    • layer5
    • layer7
    • layer11 POS_TYPE: abs WEIGHTS: OUTPUT_DIR: SCIHUB_DATA_DIR_TRAIN: ~/publaynet/layout_scihub/train SEED: 42 SOLVER: AMP: ENABLED: true BACKBONE_MULTIPLIER: 1.0 BASE_LR: 0.0002 BIAS_LR_FACTOR: 1.0 CHECKPOINT_PERIOD: 2000 CLIP_GRADIENTS: CLIP_TYPE: full_model CLIP_VALUE: 1.0 ENABLED: true NORM_TYPE: 2.0 GAMMA: 0.1 GRADIENT_ACCUMULATION_STEPS: 1 IMS_PER_BATCH: 32 LR_SCHEDULER_NAME: WarmupCosineLR MAX_ITER: 20000 MOMENTUM: 0.9 NESTEROV: false OPTIMIZER: ADAMW REFERENCE_WORLD_SIZE: 0 STEPS:
  • 10000 WARMUP_FACTOR: 0.01 WARMUP_ITERS: 333 WARMUP_METHOD: linear WEIGHT_DECAY: 0.05 WEIGHT_DECAY_BIAS: null WEIGHT_DECAY_NORM: 0.0 TEST: AUG: ENABLED: false FLIP: true MAX_SIZE: 4000 MIN_SIZES:
    • 400
    • 500
    • 600
    • 700
    • 800
    • 900
    • 1000
    • 1100
    • 1200 DETECTIONS_PER_IMAGE: 100 EVAL_PERIOD: 1000 EXPECTED_RESULTS: [] KEYPOINT_OKS_SIGMAS: [] PRECISE_BN: ENABLED: false NUM_ITER: 200 VERSION: 2 VIS_PERIOD: 0

[08/08 17:08:54 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /home/bigdata/projects/ysl/paint/Miner/PDF-Extract-Kit/models/Layout/model_final.pth ...
[08/08 17:08:54 fvcore.common.checkpoint]: [Checkpointer] Loading from /home/bigdata/projects/ysl/paint/Miner/PDF-Extract-Kit/models/Layout/model_final.pth ...
2024-08-08 17:08:55.775 | INFO | magic_pdf.model.pdf_extract_kit:init:132 - DocAnalysis init done!
2024-08-08 17:08:55.775 | INFO | magic_pdf.model.doc_analyze_by_custom_model:custom_model_init:92 - model init cost: 32.74393916130066
2024-08-08 17:09:07.147 | INFO | magic_pdf.model.pdf_extract_kit:call:143 - layout detection cost: 10.82


C++ Traceback (most recent call last):

0 at::_ops::conv2d::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt)
1 at::native::conv2d_symint(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt)
2 at::_ops::convolution::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, bool, c10::ArrayRef<c10::SymInt>, c10::SymInt)
3 at::native::convolution(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, long)
4 at::_ops::_convolution::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, bool, c10::ArrayRef<c10::SymInt>, c10::SymInt, bool, bool, bool, bool)
5 at::native::_convolution(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, long, bool, bool, bool, bool)
6 at::_ops::cudnn_convolution::call(at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt, bool, bool, bool)
7 at::native::cudnn_convolution(at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long, bool, bool, bool)


Error Message Summary:

FatalError: Segmentation fault is detected by the operating system. [TimeInfo: *** Aborted at 1723108148 (unix time) try "date -d @1723108148" if you are using GNU date ***] [SignalInfo: *** SIGSEGV (@0x20000002ef4) received by PID 1732394 (TID 0x7f9d0347b4c0) from PID 12020 ***]

Segmentation fault (core dumped)
(min) bigdata@gpu2 Miner $ `

@myhloli

redpintings avatar Aug 08 '24 09:08 redpintings

PyTorch 2.3.1+cu118

If you had installed from step 4 of the tutorial, PyTorch would not be a cu118 build; the tutorial installs the cu121 build of PyTorch. Did you really follow the tutorial?

myhloli avatar Aug 08 '24 09:08 myhloli
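
The check made in the comment above (a `+cu118` build where the tutorial is said to install `+cu121`) can be automated. This is a minimal sketch, assuming the version string has the form seen in the logs (`2.3.1+cu118`); `cuda_tag` and `EXPECTED_TAG` are hypothetical names used only for illustration.

```python
from typing import Optional

# Build tag the tutorial is said to install, per the maintainer's comment.
EXPECTED_TAG = "cu121"


def cuda_tag(version: str) -> Optional[str]:
    """Return the CUDA build tag of a PyTorch version string, or None if absent."""
    _, sep, local = version.partition("+")
    if sep and local.startswith("cu"):
        return local
    return None


if __name__ == "__main__":
    # With PyTorch installed, pass torch.__version__ instead of this literal.
    tag = cuda_tag("2.3.1+cu118")
    verdict = "matches" if tag == EXPECTED_TAG else "does not match"
    print(f"{tag} {verdict} the tutorial build ({EXPECTED_TAG})")
```

A version string without a `+cu…` suffix returns `None`, which this sketch treats as "unknown" rather than as a mismatch.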

Sorry, that was my mistake. I reinstalled the conda environment, and it now runs normally with CUDA. However, when I process this PDF with the command `CUDA_VISIBLE_DEVICES=1 magic-pdf pdf-command --pdf ./1.pdf --inside_model true`, the output only shows images.

magic-pdf, version 0.6.2b1

1.pdf

redpintings avatar Aug 08 '24 09:08 redpintings

这是正常的,因为在0.6.x版本中没有表格解析功能

myhloli avatar Aug 08 '24 09:08 myhloli

magic-pdf pdf-command --pdf "testfile_1.pdf" --inside_model true

I just tested it: with CUDA acceleration enabled it fails, but on CPU there is no problem. Step 9 raises an error, and the problem also appears at step 10. In addition, the results differ from the demo link you provide (https://opendatalab.com/OpenSourceTools/Extractor/PDF): the local results are very poor, while the demo results are very good. What is the reason? The test results are below. Local: image. Demo: image.

You can upload a sample PDF here and we will debug it. If problems already start at step 9 of the tutorial, the system is incompatible, and you may need to try an Ubuntu 22.04 Docker container.

Here are several of my test cases. They are all scanned documents containing plain text, simple tables, complex tables, images, and other elements. The files also all have page headers and watermarks, so they are fairly hard to recognize.

testfile_1.pdf testfile_2.pdf testfile_3.pdf testfile_4.pdf

My local test results match the online results. Here are my local parsing results for you to look at: output.zip

Alternatively, you can package all the files in your output directory for us to analyze.

myhloli avatar Aug 08 '24 11:08 myhloli

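Because CPU processing is slow, I only tested testfile_1.pdf. Below is the output directory for that file, covering both the auto and ocr processing modes. Thanks for your help: testfile_1.zip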

Jalen-Zhong avatar Aug 09 '24 01:08 Jalen-Zhong

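I looked at the intermediate files: some formula regions that should not be there are degrading the parsing results. I am not sure whether this is caused by incompatible dependency versions. If possible, please run pip list and upload the output for us to analyze.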

myhloli avatar Aug 09 '24 02:08 myhloli
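
As a lighter alternative to pasting a full `pip list`, the versions of the packages most relevant here can be collected with the standard library. This is only a sketch: the helper name `installed_versions` and the choice of package names (taken from names that appear in this thread) are assumptions, not something the maintainers asked for.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names as they appear in this thread; adjust as needed.
    packages = [
        "magic-pdf", "torch", "torchvision", "detectron2",
        "paddlepaddle", "paddlepaddle-gpu", "paddleocr", "unimernet",
    ]
    for name, version in installed_versions(packages).items():
        print(f"{name:20s} {version or 'not installed'}")
```

Note that the version reported for `torch` by package metadata may lack the `+cu121` suffix that `torch.__version__` shows, so it does not replace a direct check of the PyTorch build.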

Package Version
absl-py 2.1.0
aiohappyeyeballs 2.3.5
aiohttp 3.10.1
aiosignal 1.3.1
albucore 0.0.13
albumentations 1.4.13
annotated-types 0.7.0
antlr4-python3-runtime 4.9.3
anyio 4.4.0
astor 0.8.1
async-timeout 4.0.3
attrdict 2.0.1
attrs 24.2.0
Babel 2.15.0
bce-python-sdk 0.9.19
beautifulsoup4 4.12.3
black 24.8.0
blinker 1.8.2
boto3 1.34.156
botocore 1.34.156
braceexpand 0.1.7
Brotli 1.1.0
cachetools 5.4.0
certifi 2024.7.4
cffi 1.17.0
charset-normalizer 3.3.2
click 8.1.7
cloudpickle 3.0.0
colorlog 6.8.2
contourpy 1.2.1
cryptography 43.0.0
cssselect 1.2.0
cssutils 2.11.1
cycler 0.12.1
Cython 3.0.11
datasets 2.20.0
decorator 5.1.1
detectron2 0.6
dill 0.3.8
et-xmlfile 1.1.0
eva-decord 0.6.1
eval_type_backport 0.2.0
evaluate 0.4.2
exceptiongroup 1.2.2
fairscale 0.4.13
fast-langdetect 0.2.0
fasttext-wheel 0.9.2
filelock 3.15.4
fire 0.6.0
Flask 3.0.3
flask-babel 4.0.0
fonttools 4.53.1
frozenlist 1.4.1
fsspec 2024.5.0
ftfy 6.2.3
future 1.0.0
fvcore 0.1.5.post20221221
grpcio 1.65.4
h11 0.14.0
httpcore 1.0.5
httpx 0.27.0
huggingface-hub 0.24.5
hydra-core 1.3.2
idna 3.7
imageio 2.34.2
imgaug 0.4.0
iopath 0.1.9
itsdangerous 2.2.0
Jinja2 3.1.4
jmespath 1.0.1
joblib 1.4.2
kiwisolver 1.4.5
langdetect 1.0.9
lazy_loader 0.4
lmdb 1.5.1
loguru 0.7.2
lxml 5.2.2
magic-pdf 0.6.2b1
Markdown 3.6
MarkupSafe 2.1.5
matplotlib 3.9.1.post1
more-itertools 10.4.0
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
mypy-extensions 1.0.0
networkx 3.3
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.6.20
nvidia-nvtx-cu12 12.1.105
omegaconf 2.3.0
opencv-contrib-python 4.6.0.66
opencv-python 4.6.0.66
opencv-python-headless 4.10.0.84
openpyxl 3.1.5
opt-einsum 3.3.0
packaging 24.1
paddleocr 2.7.3
paddlepaddle 3.0.0b1
pandas 2.2.2
pathspec 0.12.1
pdf2docx 0.5.8
pdfminer.six 20231228
pillow 10.4.0
pip 24.0
platformdirs 4.2.2
portalocker 2.10.1
premailer 3.10.0
protobuf 4.25.4
psutil 6.0.0
py-cpuinfo 9.0.0
pyarrow 17.0.0
pyarrow-hotfix 0.6
pybind11 2.13.1
pyclipper 1.3.0.post5
pycocotools 2.0.8
pycparser 2.22
pycryptodome 3.20.0
pydantic 2.8.2
pydantic_core 2.20.1
PyMuPDF 1.24.9
PyMuPDFb 1.24.9
pyparsing 3.1.2
python-dateutil 2.9.0.post0
python-docx 1.1.2
pytz 2024.1
PyYAML 6.0.2
rapidfuzz 3.9.6
rarfile 4.2
regex 2024.7.24
requests 2.32.3
robust-downloader 0.0.2
s3transfer 0.10.2
safetensors 0.4.4
scikit-image 0.24.0
scikit-learn 1.5.1
scipy 1.14.0
seaborn 0.13.2
setuptools 72.1.0
shapely 2.0.5
six 1.16.0
sniffio 1.3.1
soupsieve 2.5
sympy 1.13.1
tabulate 0.9.0
tensorboard 2.17.0
tensorboard-data-server 0.7.2
termcolor 2.4.0
threadpoolctl 3.5.0
tifffile 2024.7.24
timm 0.9.16
tokenizers 0.19.1
tomli 2.0.1
torch 2.3.1
torchtext 0.18.0
torchvision 0.18.1
tqdm 4.66.5
transformers 4.40.0
triton 2.3.1
typing_extensions 4.12.2
tzdata 2024.1
ultralytics 8.2.74
ultralytics-thop 2.0.0
unimernet 0.1.6
urllib3 2.2.2
visualdl 2.5.3
Wand 0.6.13
wcwidth 0.2.13
webdataset 0.2.86
Werkzeug 3.0.3
wheel 0.43.0
wordninja 2.0.0
xxhash 3.4.1
yacs 0.1.8
yarl 1.9.4

Jalen-Zhong avatar Aug 09 '24 04:08 Jalen-Zhong

I followed item 10 of the GPU documentation, "10. Enable CUDA Acceleration for OCR": python -m pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/ and after that it ran.

cskkx1 avatar Aug 15 '24 09:08 cskkx1

In my case, Segmentation fault (core dumped) appeared after I did this step and installed paddlepaddle-gpu, #748. Could this be a CUDA version mismatch (118/121) problem?

[11/13 17:07:33 detectron2]: Environment info:
-------------------------------  ----------------------------------------------------------------------------------------
sys.platform                     linux
Python                           3.10.15 (main, Oct  3 2024, 07:27:34) [GCC 11.2.0]
numpy                            1.26.4
detectron2                       0.6 @/home/cxing/anaconda3/envs/MinerU/lib/python3.10/site-packages/detectron2
Compiler                         GCC 11.4
CUDA compiler                    not available
DETECTRON2_ENV_MODULE            <not set>
PyTorch                          2.3.1+cu121 @/home/cxing/anaconda3/envs/MinerU/lib/python3.10/site-packages/torch
PyTorch debug build              False
torch._C._GLIBCXX_USE_CXX11_ABI  False
GPU available                    Yes
GPU 0,1                          NVIDIA GeForce RTX 3090 (arch=8.6)
Driver version                   535.183.01
CUDA_HOME                        /usr/local/cuda
Pillow                           11.0.0
torchvision                      0.18.1+cu121 @/home/cxing/anaconda3/envs/MinerU/lib/python3.10/site-packages/torchvision
torchvision arch flags           5.0, 6.0, 7.0, 7.5, 8.0, 8.6, 9.0
fvcore                           0.1.5.post20221221
iopath                           0.1.9
cv2                              4.6.0
-------------------------------  ----------------------------------------------------------------------------------------
PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.3.6 (Git Hash 86e6af5974177e513fd3fee58425e1063e7f1361)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 12.1
  - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 8.9.3  (built against CUDA 11.8)
    - Built with CuDNN 8.9.2
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.3.1, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,

[11/13 17:07:33 detectron2]: Command line arguments: {'config_file': '/home/cxing/anaconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml', 'resume': False, 'eval_only': False, 'num_gpus': 1, 'num_machines': 1, 'machine_rank': 0, 'dist_url': 'tcp://127.0.0.1:57823', 'opts': ['MODEL.WEIGHTS', '/home/cxing/.cache/modelscope/hub/opendatalab/PDF-Extract-Kit-1___0/models/Layout/LayoutLMv3/model_final.pth']}
[11/13 17:07:33 detectron2]: Contents of args.config_file=/home/cxing/anaconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml:
AUG:
  DETR: true
CACHE_DIR: ~/cache/huggingface
CUDNN_BENCHMARK: false
DATALOADER:
  ASPECT_RATIO_GROUPING: true
  FILTER_EMPTY_ANNOTATIONS: false
  NUM_WORKERS: 4
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
DATASETS:
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: []
  PROPOSAL_FILES_TRAIN: []
  TEST:
  - scihub_train
  TRAIN:
  - scihub_train
GLOBAL:
  HACK: 1.0
ICDAR_DATA_DIR_TEST: ''
ICDAR_DATA_DIR_TRAIN: ''
INPUT:
  CROP:
    ENABLED: true
    SIZE:
    - 384
    - 600
    TYPE: absolute_range
  FORMAT: RGB
  MASK_FORMAT: polygon
  MAX_SIZE_TEST: 1333
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MIN_SIZE_TRAIN:
  - 480
  - 512
  - 544
  - 576
  - 608
  - 640
  - 672
  - 704
  - 736
  - 768
  - 800
  MIN_SIZE_TRAIN_SAMPLING: choice
  RANDOM_FLIP: horizontal
MODEL:
  ANCHOR_GENERATOR:
    ANGLES:
    - - -90
      - 0
      - 90
    ASPECT_RATIOS:
    - - 0.5
      - 1.0
      - 2.0
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES:
    - - 32
    - - 64
    - - 128
    - - 256
    - - 512
  BACKBONE:
    FREEZE_AT: 2
    NAME: build_vit_fpn_backbone
  CONFIG_PATH: ''
  DEVICE: cuda
  FPN:
    FUSE_TYPE: sum
    IN_FEATURES:
    - layer3
    - layer5
    - layer7
    - layer11
    NORM: ''
    OUT_CHANNELS: 256
  IMAGE_ONLY: true
  KEYPOINT_ON: false
  LOAD_PROPOSALS: false
  MASK_ON: true
  META_ARCHITECTURE: VLGeneralizedRCNN
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: true
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
    INSTANCE_LOSS_WEIGHT: 1.0
  PIXEL_MEAN:
  - 127.5
  - 127.5
  - 127.5
  PIXEL_STD:
  - 127.5
  - 127.5
  - 127.5
  PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: RPN
  RESNETS:
    DEFORM_MODULATED: false
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE:
    - false
    - false
    - false
    - false
    DEPTH: 50
    NORM: FrozenBN
    NUM_GROUPS: 1
    OUT_FEATURES:
    - res4
    RES2_OUT_CHANNELS: 256
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 64
    STRIDE_IN_1X1: true
    WIDTH_PER_GROUP: 64
  RETINANET:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_WEIGHTS:
    - 1.0
    - 1.0
    - 1.0
    - 1.0
    FOCAL_LOSS_ALPHA: 0.25
    FOCAL_LOSS_GAMMA: 2.0
    IN_FEATURES:
    - p3
    - p4
    - p5
    - p6
    - p7
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.4
    - 0.5
    NMS_THRESH_TEST: 0.5
    NORM: ''
    NUM_CLASSES: 10
    NUM_CONVS: 4
    PRIOR_PROB: 0.01
    SCORE_THRESH_TEST: 0.05
    SMOOTH_L1_LOSS_BETA: 0.1
    TOPK_CANDIDATES_TEST: 1000
  ROI_BOX_CASCADE_HEAD:
    BBOX_REG_WEIGHTS:
    - - 10.0
      - 10.0
      - 5.0
      - 5.0
    - - 20.0
      - 20.0
      - 10.0
      - 10.0
    - - 30.0
      - 30.0
      - 15.0
      - 15.0
    IOUS:
    - 0.5
    - 0.6
    - 0.7
  ROI_BOX_HEAD:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS:
    - 10.0
    - 10.0
    - 5.0
    - 5.0
    CLS_AGNOSTIC_BBOX_REG: true
    CONV_DIM: 256
    FC_DIM: 1024
    NAME: FastRCNNConvFCHead
    NORM: ''
    NUM_CONV: 0
    NUM_FC: 2
    POOLER_RESOLUTION: 7
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
    SMOOTH_L1_BETA: 0.0
    TRAIN_ON_PRED_BOXES: false
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 512
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    IOU_LABELS:
    - 0
    - 1
    IOU_THRESHOLDS:
    - 0.5
    NAME: CascadeROIHeads
    NMS_THRESH_TEST: 0.5
    NUM_CLASSES: 10
    POSITIVE_FRACTION: 0.25
    PROPOSAL_APPEND_GT: true
    SCORE_THRESH_TEST: 0.05
  ROI_KEYPOINT_HEAD:
    CONV_DIMS:
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    LOSS_WEIGHT: 1.0
    MIN_KEYPOINTS_PER_IMAGE: 1
    NAME: KRCNNConvDeconvUpsampleHead
    NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
    NUM_KEYPOINTS: 17
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  ROI_MASK_HEAD:
    CLS_AGNOSTIC_MASK: false
    CONV_DIM: 256
    NAME: MaskRCNNConvUpsampleHead
    NORM: ''
    NUM_CONV: 4
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  RPN:
    BATCH_SIZE_PER_IMAGE: 256
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS:
    - 1.0
    - 1.0
    - 1.0
    - 1.0
    BOUNDARY_THRESH: -1
    CONV_DIMS:
    - -1
    HEAD_NAME: StandardRPNHead
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    - p6
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.3
    - 0.7
    LOSS_WEIGHT: 1.0
    NMS_THRESH: 0.7
    POSITIVE_FRACTION: 0.5
    POST_NMS_TOPK_TEST: 1000
    POST_NMS_TOPK_TRAIN: 2000
    PRE_NMS_TOPK_TEST: 1000
    PRE_NMS_TOPK_TRAIN: 2000
    SMOOTH_L1_BETA: 0.0
  SEM_SEG_HEAD:
    COMMON_STRIDE: 4
    CONVS_DIM: 128
    IGNORE_VALUE: 255
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    LOSS_WEIGHT: 1.0
    NAME: SemSegFPNHead
    NORM: GN
    NUM_CLASSES: 10
  VIT:
    DROP_PATH: 0.1
    IMG_SIZE:
    - 224
    - 224
    NAME: layoutlmv3_base
    OUT_FEATURES:
    - layer3
    - layer5
    - layer7
    - layer11
    POS_TYPE: abs
  WEIGHTS:
OUTPUT_DIR:
SCIHUB_DATA_DIR_TRAIN: ~/publaynet/layout_scihub/train
SEED: 42
SOLVER:
  AMP:
    ENABLED: true
  BACKBONE_MULTIPLIER: 1.0
  BASE_LR: 0.0002
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 2000
  CLIP_GRADIENTS:
    CLIP_TYPE: full_model
    CLIP_VALUE: 1.0
    ENABLED: true
    NORM_TYPE: 2.0
  GAMMA: 0.1
  GRADIENT_ACCUMULATION_STEPS: 1
  IMS_PER_BATCH: 32
  LR_SCHEDULER_NAME: WarmupCosineLR
  MAX_ITER: 20000
  MOMENTUM: 0.9
  NESTEROV: false
  OPTIMIZER: ADAMW
  REFERENCE_WORLD_SIZE: 0
  STEPS:
  - 10000
  WARMUP_FACTOR: 0.01
  WARMUP_ITERS: 333
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.05
  WEIGHT_DECAY_BIAS: null
  WEIGHT_DECAY_NORM: 0.0
TEST:
  AUG:
    ENABLED: false
    FLIP: true
    MAX_SIZE: 4000
    MIN_SIZES:
    - 400
    - 500
    - 600
    - 700
    - 800
    - 900
    - 1000
    - 1100
    - 1200
  DETECTIONS_PER_IMAGE: 100
  EVAL_PERIOD: 1000
  EXPECTED_RESULTS: []
  KEYPOINT_OKS_SIGMAS: []
  PRECISE_BN:
    ENABLED: false
    NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0

[11/13 17:07:35 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /home/cxing/.cache/modelscope/hub/opendatalab/PDF-Extract-Kit-1___0/models/Layout/LayoutLMv3/model_final.pth ...
[11/13 17:07:35 fvcore.common.checkpoint]: [Checkpointer] Loading from /home/cxing/.cache/modelscope/hub/opendatalab/PDF-Extract-Kit-1___0/models/Layout/LayoutLMv3/model_final.pth ...
2024-11-13 17:07:37.598 | INFO     | magic_pdf.model.pdf_extract_kit:__init__:302 - DocAnalysis init done!
2024-11-13 17:07:37.598 | INFO     | magic_pdf.model.doc_analyze_by_custom_model:custom_model_init:131 - model init cost: 23.59622550010681


--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   at::_ops::conv2d::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt)
1   at::native::conv2d_symint(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt)
2   at::_ops::convolution::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, bool, c10::ArrayRef<c10::SymInt>, c10::SymInt)
3   at::_ops::convolution::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, bool, c10::ArrayRef<c10::SymInt>, c10::SymInt)
4   at::native::convolution(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool, c10::ArrayRef<long>, long)
5   at::_ops::_convolution::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, bool, c10::ArrayRef<c10::SymInt>, c10::SymInt, bool, bool, bool, bool)
6   at::native::_convolution(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool, c10::ArrayRef<long>, long, bool, bool, bool, bool)
7   at::_ops::cudnn_convolution::call(at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt, bool, bool, bool)
8   at::native::cudnn_convolution(at::Tensor const&, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, long, bool, bool, bool)

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo: *** Aborted at 1731488858 (unix time) try "date -d @1731488858" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x20000002ef4) received by PID 7431 (TID 0x7c23f0616740) from PID 12020 ***]

Segmentation fault (core dumped)

loveritsu929re avatar Nov 13 '24 09:11 loveritsu929re
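
The cu118/cu121 question raised in the last comment can at least be made explicit by comparing the CUDA tag of the PaddlePaddle wheel index with the tag of the PyTorch build shown in the environment info. The sketch below only does that comparison; the helper names are hypothetical, and the thread does not confirm that such a mismatch is what causes the segmentation fault.

```python
import re
from typing import Optional, Tuple


def parse_cuda_tag(tag: str) -> Tuple[int, int]:
    """Turn a tag such as 'cu118' into (11, 8); the last digit is the minor version."""
    match = re.fullmatch(r"cu(\d+)(\d)", tag)
    if match is None:
        raise ValueError(f"not a CUDA tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


def tag_from_index_url(url: str) -> Optional[str]:
    """Extract a trailing 'cuNNN' path component from a wheel index URL, if any."""
    match = re.search(r"/(cu\d+)/?$", url)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Values copied from the thread: the paddlepaddle-gpu index URL and the
    # "PyTorch 2.3.1+cu121" line of the detectron2 environment info.
    paddle_tag = tag_from_index_url("https://www.paddlepaddle.org.cn/packages/stable/cu118/")
    torch_tag = "cu121"
    paddle_cuda = parse_cuda_tag(paddle_tag)
    torch_cuda = parse_cuda_tag(torch_tag)
    print("paddle wheel CUDA:", paddle_cuda)
    print("torch build CUDA :", torch_cuda)
    print("same CUDA major  :", paddle_cuda[0] == torch_cuda[0])
```

If the major versions differ, one experiment is to install both frameworks from builds targeting the same CUDA major version and see whether the crash persists.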