Results 4 issues of [email protected]

With the current version (0.8.1), parsing a PDF document that uses a three-column layout produces output in which the paragraphs are out of order. ![image](https://github.com/user-attachments/assets/55a6c6bc-6fa0-485a-a8fd-5dcc1237642d)

Partial run log:

2024-09-14 10:20:35.811 | INFO | magic_pdf.model.pdf_extract_kit:__call__:289 - formula nums: 2, mfr time: 0.33
2024-09-14 10:20:36.460 | INFO | magic_pdf.model.pdf_extract_kit:__call__:372 - ocr cost: 0.65
2024-09-14 10:20:36.461 | INFO...

enhancement

### Description of the bug

With version 0.9.0, the reading order detected for documents with a multi-column layout is incorrect.

### How to reproduce the bug

The source PDF document is attached: [14-美国“马赛克战”作战概念解析_雷子欣.pdf](https://github.com/user-attachments/files/17676559/14-._.pdf)

Screenshots showing the incorrect reading order of the detected layout:

![bdb8fd0fd49940fdbae97364999172f3](https://github.com/user-attachments/assets/99e25e0d-8c99-4bfb-b0b1-4dca3ed49e14) ![image](https://github.com/user-attachments/assets/2c09627d-0ede-4989-bcc3-75ae59626a40)

### Operating system

Linux

### Python version...

bug

### Description of the bug

On the same page, the same formula (symbol) is recognized as a formula in one place and as plain text in another.

### How to reproduce the bug

On the same page, the same formula (symbol) is recognized as a formula in one place and as plain text in another. For example:

![Image](https://github.com/user-attachments/assets/deb64ed0-ce7f-4129-b5eb-f18441344870) ![Image](https://github.com/user-attachments/assets/c9bd2e2c-67cd-45b7-9462-efd44c289fdb) ![Image](https://github.com/user-attachments/assets/6503807c-a25b-46dc-a97f-0b71cda43627)

Markdown output after parsing:

![Image](https://github.com/user-attachments/assets/81dead0d-9bee-4f29-b4cd-2b4013b7f17e)

Source PDF document: [test01_origin.pdf](https://github.com/user-attachments/files/18517899/test01_origin.pdf)

Span results: [test01_spans.pdf](https://github.com/user-attachments/files/18517955/test01_spans.pdf)

### Operating system |...

bug

### 🔎 Search before asking

- [x] I have searched the MinerU [Readme](https://github.com/opendatalab/MinerU) and found no similar bug report.
- [x] I have searched the MinerU [Issues](https://github.com/opendatalab/MinerU/issues) and...

bug