MinerU
Running magic-pdf --version from the command line fails with a TypeError
Description of the bug
On Windows 10 with conda, the command-line demo fails to run and prints the following:
(MinerU) C:\Users\tu_ha>magic-pdf --version
Traceback (most recent call last):
  File "D:\0_dev\anaconda\envs\MinerU\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "D:\0_dev\anaconda\envs\MinerU\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\0_dev\anaconda\envs\MinerU\Scripts\magic-pdf.exe\__main__.py", line 4, in <module>
    from magic_pdf.cli.magicpdf import cli
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 33, in <module>
    from magic_pdf.pipe.UNIPipe import UNIPipe
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\magic_pdf\pipe\UNIPipe.py", line 11, in <module>
    from magic_pdf.user_api import parse_union_pdf, parse_ocr_pdf
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\magic_pdf\user_api.py", line 21, in <module>
    from magic_pdf.pdf_parse_by_ocr import parse_pdf_by_ocr
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\magic_pdf\pdf_parse_by_ocr.py", line 1, in <module>
    from magic_pdf.pdf_parse_union_core import pdf_parse_union
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\magic_pdf\pdf_parse_union_core.py", line 13, in <module>
    from magic_pdf.para.para_split_v2 import para_split
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\magic_pdf\para\para_split_v2.py", line 1, in <module>
    from sklearn.cluster import DBSCAN
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\sklearn\__init__.py", line 84, in <module>
    from .base import clone
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\sklearn\base.py", line 19, in <module>
    from .utils._estimator_html_repr import _HTMLDocumentationLinkMixin, estimator_html_repr
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\sklearn\utils\__init__.py", line 11, in <module>
    from ._chunking import gen_batches, gen_even_slices
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\sklearn\utils\_chunking.py", line 8, in <module>
    from ._param_validation import Interval, validate_params
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\sklearn\utils\_param_validation.py", line 14, in <module>
    from .validation import _is_arraylike_not_scalar
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\sklearn\utils\validation.py", line 26, in <module>
    from ..utils._array_api import _asarray_with_order, _is_numpy_namespace, get_namespace
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\sklearn\utils\_array_api.py", line 11, in <module>
    from .fixes import parse_version
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\sklearn\utils\fixes.py", line 20, in <module>
    import scipy.stats
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\scipy\stats\__init__.py", line 610, in <module>
    from ._stats_py import *
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\scipy\stats\_stats_py.py", line 49, in <module>
    from . import distributions
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\scipy\stats\distributions.py", line 10, in <module>
    from . import _continuous_distns
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\scipy\stats\_continuous_distns.py", line 12, in <module>
    from scipy.interpolate import BSpline
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\scipy\interpolate\__init__.py", line 167, in <module>
    from ._interpolate import *
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\scipy\interpolate\_interpolate.py", line 12, in <module>
    from . import _fitpack_py
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\scipy\interpolate\_fitpack_py.py", line 8, in <module>
    from ._fitpack_impl import bisplrep, bisplev, dblint # noqa: F401
  File "D:\0_dev\anaconda\envs\MinerU\lib\site-packages\scipy\interpolate\_fitpack_impl.py", line 103, in <module>
    'iwrk': array([], dfitpack_int), 'u': array([], float),
TypeError
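Since the traceback ends inside scipy's own import chain (scipy.interpolate, pulled in by sklearn), it may help to check whether the failure reproduces without magic-pdf at all. The following is only a diagnostic sketch: the helper name try_import is made up for illustration, and the list of modules is simply taken from the traceback above. It does not assume any particular root cause.

```python
import importlib
import traceback


def try_import(module_name):
    """Try to import a module by name.

    Returns (True, version string) on success, or (False, formatted
    traceback) if the import raises. try_import is a hypothetical helper,
    not part of magic-pdf, scipy, or scikit-learn.
    """
    try:
        module = importlib.import_module(module_name)
    except Exception:
        return False, traceback.format_exc()
    return True, getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    # Modules taken from the traceback, ordered from lowest level to highest.
    for name in ("numpy", "scipy", "scipy.interpolate", "sklearn.cluster"):
        ok, details = try_import(name)
        print(f"{name}: {'OK' if ok else 'FAILED'}")
        print(details)
```

If the first module reported as FAILED is scipy.interpolate, the problem would appear to lie in the numpy/scipy installation of this environment rather than in magic-pdf itself.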
How to reproduce the bug
The commands run so far were:
pip install magic-pdf[full-cpu]
pip install detectron2 --extra-index-url https://myhloli.github.io/wheels/
Packages currently installed in the environment:
(MinerU) C:\Users\tu_ha>pip list
Package Version
------------------------- ------------------
absl-py 2.1.0
aiohttp 3.9.5
aiosignal 1.3.1
albucore 0.0.12
albumentations 1.4.12
altair 5.3.0
annotated-types 0.7.0
antlr4-python3-runtime 4.9.3
anyio 4.4.0
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.3.0
astor 0.8.1
asttokens 2.4.1
async-lru 2.0.4
async-timeout 4.0.3
attrdict 2.0.1
attrs 23.2.0
Babel 2.15.0
bce-python-sdk 0.9.17
beautifulsoup4 4.12.3
black 24.4.2
bleach 6.1.0
blinker 1.8.2
boto3 1.34.149
botocore 1.34.149
braceexpand 0.1.7
Brotli 1.1.0
cachetools 5.4.0
certifi 2024.7.4
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
cloudpickle 3.0.0
colorama 0.4.6
colorlog 6.8.2
comm 0.2.2
contourpy 1.2.1
cryptography 43.0.0
cssselect 1.2.0
cssutils 2.11.1
cycler 0.12.1
Cython 3.0.10
datasets 2.20.0
debugpy 1.8.2
decorator 5.1.1
defusedxml 0.7.1
detectron2 0.6
dill 0.3.8
et-xmlfile 1.1.0
eva-decord 0.6.1
eval_type_backport 0.2.0
evaluate 0.4.2
exceptiongroup 1.2.2
executing 2.0.1
fairscale 0.4.13
fast-langdetect 0.2.1
fastjsonschema 2.20.0
fasttext-wheel 0.9.2
filelock 3.15.4
fire 0.6.0
Flask 3.0.3
flask-babel 4.0.0
fonttools 4.53.1
fqdn 1.5.1
frozenlist 1.4.1
fsspec 2024.5.0
ftfy 6.2.0
future 1.0.0
fvcore 0.1.5.post20221221
gitdb 4.0.11
GitPython 3.1.43
grpcio 1.65.1
h11 0.14.0
httpcore 1.0.5
httpx 0.27.0
huggingface-hub 0.24.2
hydra-core 1.3.2
idna 3.7
imageio 2.34.2
imgaug 0.4.0
intel-openmp 2021.4.0
iopath 0.1.9
ipykernel 6.29.5
ipython 8.26.0
isoduration 20.11.0
itsdangerous 2.2.0
jedi 0.19.1
Jinja2 3.1.4
jmespath 1.0.1
joblib 1.4.2
json5 0.9.25
jsonpointer 3.0.0
jsonschema 4.23.0
jsonschema-specifications 2023.12.1
jupyter_client 8.6.2
jupyter_core 5.7.2
jupyter-events 0.10.0
jupyter-lsp 2.2.5
jupyter_server 2.14.2
jupyter_server_terminals 0.5.3
jupyterlab 4.2.4
jupyterlab_pygments 0.3.0
jupyterlab_server 2.27.3
kiwisolver 1.4.5
lazy_loader 0.4
lmdb 1.5.1
loguru 0.7.2
lxml 5.2.2
magic-pdf 0.6.1
Markdown 3.6
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.9.1
matplotlib-inline 0.1.7
mdurl 0.1.2
mistune 3.0.2
mkl 2021.4.0
more-itertools 10.3.0
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
mypy-extensions 1.0.0
nbclient 0.10.0
nbconvert 7.16.4
nbformat 5.10.4
nest-asyncio 1.6.0
networkx 3.3
nltk 3.8.1
notebook_shim 0.2.4
numpy 1.26.4
omegaconf 2.3.0
opencv-contrib-python 4.6.0.66
opencv-python 4.6.0.66
opencv-python-headless 4.10.0.84
openpyxl 3.1.5
opt-einsum 3.3.0
overrides 7.7.0
packaging 24.1
paddleocr 2.7.3
paddlepaddle 2.6.1
pandas 2.2.2
pandocfilters 1.5.1
parso 0.8.4
pathspec 0.12.1
pdf2docx 0.5.8
pdf2image 1.17.0
pdfminer.six 20240706
pillow 10.4.0
pip 24.0
platformdirs 4.2.2
portalocker 2.10.1
premailer 3.10.0
prometheus_client 0.20.0
prompt_toolkit 3.0.47
protobuf 3.20.2
psutil 6.0.0
pure_eval 0.2.3
py-cpuinfo 9.0.0
pyarrow 17.0.0
pyarrow-hotfix 0.6
pybind11 2.13.1
pyclipper 1.3.0.post5
pycocotools 2.0.8
pycparser 2.22
pycryptodome 3.20.0
pydantic 2.8.2
pydantic_core 2.20.1
pydeck 0.9.1
Pygments 2.18.0
PyMuPDF 1.24.9
PyMuPDFb 1.24.9
pyparsing 3.1.2
pypdfium2 4.30.0
python-dateutil 2.9.0.post0
python-docx 1.1.2
python-json-logger 2.0.7
pytz 2024.1
pywin32 306
pywinpty 2.0.13
PyYAML 6.0.1
pyzmq 26.0.3
rapidfuzz 3.9.4
rarfile 4.2
referencing 0.35.1
regex 2024.7.24
requests 2.32.3
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rich 13.7.1
robust-downloader 0.0.2
rpds-py 0.19.1
s3transfer 0.10.2
safetensors 0.4.3
scikit-image 0.24.0
scikit-learn 1.5.1
scipy 1.14.0
seaborn 0.13.2
Send2Trash 1.8.3
setuptools 69.5.1
shapely 2.0.5
six 1.16.0
smmap 5.0.1
sniffio 1.3.1
soupsieve 2.5
stack-data 0.6.3
streamlit 1.37.0
streamlit-drawable-canvas 0.9.3
sympy 1.13.1
tabulate 0.9.0
tbb 2021.13.0
tenacity 8.5.0
tensorboard 2.17.0
tensorboard-data-server 0.7.2
termcolor 2.4.0
terminado 0.18.1
threadpoolctl 3.5.0
tifffile 2024.7.24
timm 0.9.16
tinycss2 1.3.0
tokenizers 0.19.1
toml 0.10.2
tomli 2.0.1
toolz 0.12.1
torch 2.3.1
torchtext 0.18.0
torchvision 0.18.1
tornado 6.4.1
tqdm 4.66.4
traitlets 5.14.3
transformers 4.40.0
types-python-dateutil 2.9.0.20240316
typing_extensions 4.12.2
tzdata 2024.1
ultralytics 8.2.67
ultralytics-thop 2.0.0
unimernet 0.1.1
uri-template 1.3.0
urllib3 2.2.2
visualdl 2.5.3
Wand 0.6.13
watchdog 4.0.1
wcwidth 0.2.13
webcolors 24.6.0
webdataset 0.2.86
webencodings 0.5.1
websocket-client 1.8.0
Werkzeug 3.0.3
wheel 0.43.0
win32-setctime 1.1.0
wordninja 2.0.0
xxhash 3.4.1
yacs 0.1.8
yarl 1.9.4
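For follow-up reports, a shorter version listing may be easier to compare than the full pip list output. A minimal sketch, assuming Python 3.8+ so that importlib.metadata is available; the helper name installed_versions and the choice of distributions (those involved in the traceback, plus mkl and intel-openmp from the list above) are illustrative only. It reads installed metadata and does not import the packages, so it should still run while the import itself is failing.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent.

    installed_versions is a hypothetical helper written for this report.
    """
    versions = {}
    for dist in distributions:
        try:
            versions[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            versions[dist] = None
    return versions


if __name__ == "__main__":
    # Distribution names as they appear in the pip list output above.
    names = ["numpy", "scipy", "scikit-learn", "magic-pdf", "mkl", "intel-openmp"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```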
Operating system
Windows
Python version
3.10
Software version (magic-pdf --version)
0.6.x
Device mode
cpu