drunkpig

Results 91 comments of drunkpig

@freedom1993 Please upload your PDF so we can improve this functionality.

@freedom1993 Turn on OCR with the flag `--method ocr`.

@freedom1993 Please upload your PDF file.

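> How do I configure it if I don't want the images, and only want some of the text information in the images?

In the output directory, find `XXX_content_list.json` and concatenate all elements, ignoring those whose `type` equals `images` or `table`.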
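The filtering suggested above could be sketched in Python roughly as follows. This is only an illustration, not part of the project: it assumes the content list file is a JSON array of objects, and that text-bearing elements carry their content under a `text` key. The function names and the extra singular `image` in the skip set are assumptions made for this sketch.

```python
import json
from pathlib import Path

# Types to ignore. "images" and "table" come from the comment above;
# the singular "image" is an assumed variant added defensively.
SKIP_TYPES = {"images", "image", "table"}


def concat_text(elements, skip_types=SKIP_TYPES):
    """Join the text of all elements whose type is not in skip_types."""
    parts = []
    for element in elements:
        if element.get("type") in skip_types:
            continue
        # Assumption: text content is stored under the "text" key.
        text = element.get("text", "")
        if text and text.strip():
            parts.append(text.strip())
    return "\n".join(parts)


def concat_text_from_file(path):
    """Load a content list JSON file and return its concatenated text."""
    with Path(path).open(encoding="utf-8") as f:
        return concat_text(json.load(f))
```

Calling `concat_text_from_file("XXX_content_list.json")` with the real file name from your output directory would then return the text without image or table elements. If your file uses different keys, adjust `SKIP_TYPES` and the `text` lookup accordingly.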

In fact, both EPUB and MOBI are compiled from HTML. Our approach is to use the tool Calibre to convert EPUB and MOBI into HTML, and then use the project...

[magic_pdf-0.6.2b1-released](https://github.com/opendatalab/MinerU/releases/tag/magic_pdf-0.6.2b1-released) solved this error.

@wumaotegan The `complicated_layout` tag indicates that the layout of this page is complex and that the text order may be wrong. However, this tag does not determine whether OCR is triggered.

@wumaotegan Can you provide your PDFs to help us improve the model?

@Invariant0502 "pdf_path" on the command line is the real path to a PDF file on your disk. For example:

```shell
magicpdf --pdf /mnt/data/mybook.pdf --inside_model true
```