(MinerU) llw@lianyan:~/github/marker/pdf_marker/workspace2/pdf$ magic-pdf -p small_ocr.pdf
2024-08-14 08:33:14.220 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 8, cid_chars_radio: 0.0
2024-08-14 08:33:14.221 | WARNING | magic_pdf.filter.pdf_classify_by_type:classify:334 - pdf is not classified by area and text_len, by_image_area: False, by_text: False, by_avg_words: False, by_img_num: True, by_text_layout: False, by_img_narrow_strips: False, by_invalid_chars: True
2024-08-14 08:33:14.240 | ERROR | magic_pdf.model.pdf_extract_kit:<module>:27 - libnccl.so.2: cannot open shared object file: No such file or directory
Traceback (most recent call last):
File "/home/llw/miniconda3/envs/MinerU/bin/magic-pdf", line 8, in <module>
sys.exit(cli())
│ │ └ <Command cli>
│ └ <built-in function exit>
└ <module 'sys' (built-in)>
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
│ │ │ └ {}
│ │ └ ()
│ └ <function BaseCommand.main at 0x7d6fd1a1a200>
└ <Command cli>
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
│ │ └ <click.core.Context object at 0x7d6fd1d56d40>
│ └ <function Command.invoke at 0x7d6fd1a1acb0>
└ <Command cli>
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
│ │ │ │ │ └ {'path': 'small_ocr.pdf', 'output_dir': '', 'method': 'auto'}
│ │ │ │ └ <click.core.Context object at 0x7d6fd1d56d40>
│ │ │ └ <function cli at 0x7d6f5f338700>
│ │ └ <Command cli>
│ └ <function Context.invoke at 0x7d6fd1a19a20>
└ <click.core.Context object at 0x7d6fd1d56d40>
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
│ └ {'path': 'small_ocr.pdf', 'output_dir': '', 'method': 'auto'}
└ ()
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/tools/cli.py", line 75, in cli
parse_doc(path)
│ └ 'small_ocr.pdf'
└ <function cli.<locals>.parse_doc at 0x7d6fd1c33490>
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/tools/cli.py", line 60, in parse_doc
do_parse(
└ <function do_parse at 0x7d6f5f323be0>
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/tools/common.py", line 65, in do_parse
pipe.pipe_analyze()
│ └ <function UNIPipe.pipe_analyze at 0x7d6f5f323880>
└ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7d6f5f328d00>
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/pipe/UNIPipe.py", line 31, in pipe_analyze
self.model_list = doc_analyze(self.pdf_bytes, ocr=True)
│ │ │ │ └ b'%PDF-1.7\r\n%\xa1\xb3\xc5\xd7\r\n1 0 obj\r\n<</Pages 2 0 R /Type/Catalog>>\r\nendobj\r\n2 0 obj\r\n<</Count 8/Kids[ 4 0 R ...
│ │ │ └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7d6f5f328d00>
│ │ └ <function doc_analyze at 0x7d6fcc9d68c0>
│ └ []
└ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7d6f5f328d00>
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/model/doc_analyze_by_custom_model.py", line 109, in doc_analyze
custom_model = model_manager.get_model(ocr, show_log)
│ │ │ └ False
│ │ └ True
│ └ <function ModelSingleton.get_model at 0x7d6fcc9d6830>
└ <magic_pdf.model.doc_analyze_by_custom_model.ModelSingleton object at 0x7d6f5ee28460>
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/model/doc_analyze_by_custom_model.py", line 63, in get_model
self._models[key] = custom_model_init(ocr=ocr, show_log=show_log)
│ │ │ │ │ └ False
│ │ │ │ └ True
│ │ │ └ <function custom_model_init at 0x7d6fcc9d6710>
│ │ └ (True, False)
│ └ {}
└ <magic_pdf.model.doc_analyze_by_custom_model.ModelSingleton object at 0x7d6f5ee28460>
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/model/doc_analyze_by_custom_model.py", line 83, in custom_model_init
from magic_pdf.model.pdf_extract_kit import CustomPEKModel
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/model/pdf_extract_kit.py", line 13, in <module>
import torch
File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/torch/__init__.py", line 239, in <module>
from torch._C import * # noqa: F403
ImportError: libnccl.so.2: cannot open shared object file: No such file or directory
2024-08-14 08:33:14.246 | ERROR | magic_pdf.model.pdf_extract_kit:<module>:28 - Required dependency not installed, please install by
"pip install magic-pdf[full] --extra-index-url https://myhloli.github.io/wheels/"
I followed the same steps in the same environment and get this error too.
@hzzheng0612 @lianyant Have you installed NCCL? https://developer.nvidia.com/nccl
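One quick way to answer that question without importing torch is to ask the dynamic linker directly. The snippet below is a minimal sketch, not part of magic-pdf: `can_load` is a hypothetical helper name, and it only assumes the library name `libnccl.so.2` taken from the error message above.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic linker can open the given shared library."""
    try:
        ctypes.CDLL(soname)  # raises OSError when the library cannot be opened
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Library name taken from the ImportError in the traceback above.
    soname = "libnccl.so.2"
    print(f"{soname}: {'found' if can_load(soname) else 'MISSING'}")
```

If this prints MISSING inside the same conda environment, `import torch` will most likely fail the same way until the library is installed or made visible to the linker.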
I get the same error on CentOS 7 with Python 3.10 in CPU mode; the version is 0.7.0b1.
After investigating on my side, I found it was caused by several missing OpenGL libraries. I installed them with yum -y install epel-release && yum -y install mesa-libGL mesa-libGLU libXtst libXrender, and once these dependencies were in place it ran normally.
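To see which of those packages are actually missing before running yum, something like the following may help. It is only a sketch: the mapping from short library names (`GL`, `GLU`, `Xtst`, `Xrender`) to the yum package names above is my assumption, and `missing_packages` is a hypothetical helper, not anything shipped with magic-pdf.

```python
from ctypes.util import find_library

# Assumed mapping: short library name -> yum package from the comment above.
CANDIDATES = {
    "GL": "mesa-libGL",
    "GLU": "mesa-libGLU",
    "Xtst": "libXtst",
    "Xrender": "libXrender",
}


def missing_packages(candidates: dict) -> list:
    """Return the package names whose library the system cannot locate."""
    return [pkg for name, pkg in candidates.items() if find_library(name) is None]


if __name__ == "__main__":
    missing = missing_packages(CANDIDATES)
    if missing:
        print("Possibly missing packages:", " ".join(missing))
    else:
        print("All listed libraries were found.")
```

`find_library` relies on tools such as ldconfig, so on a very minimal system a library can be reported as missing even though it is installed; treat the result as a hint.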
@hzzheng0612 @lianyant Have you installed NCCL? https://developer.nvidia.com/nccl
After I installed this, everything worked normally. Thank you very much @myhloli