
Problem: Loss of Equations Between Paragraphs in PDF to Markdown Conversion

Open nunamia opened this issue 1 year ago • 0 comments

I have some questions about the implementation. Can Marker's PDF to Markdown conversion record the coordinates of each paragraph, either in the Markdown itself in a 'bbox': (x0, y0, x1, y1) format or in a separate report such as a layout.json file? PyMuPDF already provides data containing this information.
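To illustrate what I mean, here is a minimal sketch of the kind of output I am after. It is not Marker's API: the function names `annotate_blocks` and `pdf_to_annotated_markdown` and the HTML-comment format for the bbox are my own assumptions. The only library behavior it relies on is that PyMuPDF's `page.get_text("blocks")` returns tuples of the form `(x0, y0, x1, y1, text, block_no, block_type)`.

```python
import json


def annotate_blocks(blocks):
    """Render PyMuPDF-style text blocks as Markdown paragraphs,
    each preceded by an HTML comment carrying its bounding box.

    The comment format is a hypothetical convention, not Marker output.
    """
    parts = []
    for x0, y0, x1, y1, text, _block_no, block_type in blocks:
        if block_type != 0:  # 0 = text block, 1 = image block
            continue
        text = " ".join(text.split())  # collapse line breaks inside the block
        if not text:
            continue
        bbox = [round(v, 1) for v in (x0, y0, x1, y1)]
        parts.append(f"<!-- {json.dumps({'bbox': bbox})} -->\n{text}")
    return "\n\n".join(parts)


def pdf_to_annotated_markdown(pdf_path):
    """Read a PDF with PyMuPDF and annotate every text block on every page."""
    import fitz  # PyMuPDF, imported lazily so annotate_blocks works without it

    with fitz.open(pdf_path) as doc:
        pages = [annotate_blocks(page.get_text("blocks")) for page in doc]
    return "\n\n".join(p for p in pages if p)
```

Calling `pdf_to_annotated_markdown("3.pdf")` on the attached file would then give one commented paragraph per text block. The same bbox list could just as well be written to a separate JSON file instead of being embedded in the Markdown.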

There also seems to be a significant error in recognizing and storing equations. For example, formulas within the text or between paragraphs are being lost. How can this issue be addressed?

[Attachment Included]

3.pdf

nunamia · Jan 23 '24 09:01