Xiaomeng Zhao comments

Results: 690 comments of Xiaomeng Zhao

Error: ValueError: Unable to avoid copy while creating an array as requested.

How about creating a fresh conda environment, going through the setup from scratch, and trying again?
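Before rebuilding the environment, it may be worth checking which NumPy major version is installed. This message is the one NumPy 2.x raises when `np.array(..., copy=False)` cannot avoid a copy, so an environment that pulled in NumPy 2 alongside code written for NumPy 1.x is one plausible cause. That diagnosis is an assumption here, not something confirmed in this thread. The helper name `installed_major_version` is hypothetical and uses only the standard library.

```python
from importlib import metadata


def installed_major_version(package):
    """Return the installed major version of `package`, or None if it is absent."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    # "1.26.4" -> 1, "2.0.1" -> 2
    return int(version.split(".")[0])


if __name__ == "__main__":
    major = installed_major_version("numpy")
    if major is None:
        print("numpy is not installed in this environment")
    elif major >= 2:
        print("NumPy 2.x detected; code using np.array(..., copy=False) may raise this error")
    else:
        print("NumPy 1.x detected; the cause is probably elsewhere")
```

If NumPy 2.x turns out to be present, a fresh environment with the project's pinned dependencies is still the simplest way to rule out a version mismatch.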

GPU memory stays resident for a long time

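Could it be that, after switching to a service, a process stays resident in the background? Once the task finishes, that process has to be shut down before the GPU memory is released.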
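One way to act on this, sketched under assumptions: run each parsing job in its own short-lived Python process, so that whatever GPU memory the job allocated is returned when the process exits. The function name `run_job_in_fresh_process` and the idea of a standalone job script are hypothetical; nothing below uses the MinerU API.

```python
import subprocess
import sys


def run_job_in_fresh_process(python_args, timeout=None):
    """Run `python <python_args...>` as a child process and wait for it to exit.

    Memory held by the child, including GPU memory, is released by the
    operating system and driver once the child process terminates.
    """
    completed = subprocess.run(
        [sys.executable, *python_args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    # Stand-in for a real job script such as ["parse_one_pdf.py", "input.pdf"].
    code, out = run_job_in_fresh_process(["-c", "print('done')"])
    print(code, out.strip())
```

The trade-off is that models are reloaded for every job, so this suits infrequent jobs better than a high-throughput service.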

How can demo.py support choosing between the ocr, txt, and auto modes, as in magic-pdf pdf-command [OPTIONS]?

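At https://github.com/opendatalab/MinerU/blob/4983bc1df668b80fa3481fa657eb509b448bb082/demo/demo.py#L20, assign a value to "_pdf_type". It can be set to "ocr" or "txt", which correspond to the ocr and txt modes on the command line. You also need to comment out the `pipe.pipe_classify()` call on line 25. If line 25 is left in place, the mode is still auto; if it is commented out, the mode given in `_pdf_type` is used.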
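The rule above can be sketched as a small helper that turns a mode name into the value for `_pdf_type` plus a flag saying whether `pipe.pipe_classify()` should still be called. The helper `build_parse_config` is hypothetical, and the `model_list` key and the empty string used for auto mode are assumptions about the surrounding demo.py code rather than something stated in this comment.

```python
VALID_MODES = ("auto", "ocr", "txt")


def build_parse_config(mode="auto", model_list=None):
    """Map a CLI-style mode name to the dict passed to the pipe and a classify flag."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # In auto mode the type is left empty and decided by classification.
    pdf_type = "" if mode == "auto" else mode
    config = {"_pdf_type": pdf_type, "model_list": model_list or []}
    # Classification only runs in auto mode; otherwise `_pdf_type` is used as given.
    run_classify = mode == "auto"
    return config, run_classify


if __name__ == "__main__":
    config, run_classify = build_parse_config("ocr")
    print(config, run_classify)
    # In demo.py this would roughly become:
    #   pipe = UNIPipe(pdf_bytes, config, image_writer)
    #   if run_classify:
    #       pipe.pipe_classify()
```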

Disable formula parsing

https://github.com/opendatalab/MinerU/blob/7cdf88c668f90c7a97821d5f26f10340dd8f5000/magic_pdf/model/doc_analyze_by_custom_model.py#L88 Just add an entry `"apply_formula": False` at the end of this dictionary.
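As a minimal illustration of that change, the snippet below adds the `"apply_formula": False` entry to a dictionary without modifying the original. The keys `ocr` and `show_log` are placeholders, since the real dictionary at the linked line may hold different entries, and `disable_formula` is a hypothetical helper name.

```python
def disable_formula(model_kwargs):
    """Return a copy of the model-init dict with formula parsing turned off."""
    updated = dict(model_kwargs)
    updated["apply_formula"] = False
    return updated


if __name__ == "__main__":
    # Placeholder keys; the actual dictionary in the linked file may differ.
    example_kwargs = {"ocr": True, "show_log": False}
    print(disable_formula(example_kwargs))
```

Editing the dictionary literal directly in the source file, as the comment suggests, has the same effect.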

Disable formula parsing

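> > https://github.com/opendatalab/MinerU/blob/7cdf88c668f90c7a97821d5f26f10340dd8f5000/magic_pdf/model/doc_analyze_by_custom_model.py#L88
> >
> > Just add an entry `"apply_formula": False` at the end of this dictionary.
>
> In that case, how are formulas recognized? Through OCR? Or are formulas no longer recognized at all?

With this change, formulas are no longer recognized.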

Support automatically uploading images to S3

This feature is supported. See the readme at https://github.com/opendatalab/MinerU/tree/master#api

```python
image_dir = "s3://img_bucket/"
s3image_cli = S3ReaderWriter(img_ak, img_sk, img_endpoint, parent_path=image_dir)
...
pipe = UNIPipe(pdf_bytes, jso_useful_key, s3image_cli)
```

With this setup, the images extracted during parsing are uploaded automatically to `s3://img_bucket/`, and the image tags in the generated markdown contain the fully assembled S3 paths.
