markitdown

Python tool for converting files and office documents to Markdown.

Results 345 markitdown issues

When trying to convert a German PDF, I get this error: `UnicodeEncodeError: 'charmap' codec can't encode character '\u2212' in position 15215: character maps to <undefined>`
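A minimal sketch of what this error usually means, assuming the failure happens when the converted text is written out with a legacy Windows code page (the report does not show where the write happens, so that is an assumption). It reproduces the encoding failure for U+2212 with `cp1252` and shows that an explicit UTF-8 encoding round-trips the same text. The sample string and file name are made up for illustration.

```python
import os
import tempfile

# U+2212 MINUS SIGN is the character named in the error message.
# The surrounding text is an invented sample, not taken from the report.
text = "Temperatur: \u22125 \u00b0C"

# cp1252 (reported by Python as the 'charmap' codec) has no mapping for
# U+2212, so encoding raises UnicodeEncodeError.
try:
    text.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# Passing encoding="utf-8" explicitly avoids relying on the platform default.
with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "output.md")
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(text)
    with open(out_path, encoding="utf-8") as fh:
        round_trip = fh.read()
```

If the error comes from printing to a console rather than writing a file, setting the `PYTHONIOENCODING=utf-8` environment variable is a common workaround.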

Cannot figure out why this is failing and nothing is, it's driving me crazy. UPDATE: To be fair I only tried the xls from tests, but that should work. Images,...

Use `pymupdf4llm` instead of `pdfminer` to parse PDF contents into Markdown, as suggested by #131. Pros and cons: - `pdfminer` extracts text only; generated files have no headings, titles,...

My take on how this code should support async, given that none of the underlying libraries support async. For more details/context, see: https://github.com/microsoft/markitdown/issues/13#issuecomment-2543834157
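One common way to expose an async interface over blocking libraries is to run the blocking call in a worker thread. The sketch below illustrates that general pattern with `asyncio.to_thread` (Python 3.9+); it is not necessarily what the PR does. `convert_blocking` is a hypothetical stand-in for a synchronous conversion call, not part of markitdown.

```python
import asyncio
import time
from typing import List


def convert_blocking(path: str) -> str:
    # Hypothetical stand-in for a synchronous, blocking conversion call.
    time.sleep(0.05)
    return f"# Converted {path}"


async def convert_async(path: str) -> str:
    # Run the blocking function in a worker thread so the event loop stays free.
    return await asyncio.to_thread(convert_blocking, path)


async def convert_many(paths: List[str]) -> List[str]:
    # gather() returns results in the same order as the input paths.
    return await asyncio.gather(*(convert_async(p) for p in paths))


results = asyncio.run(convert_many(["a.pdf", "b.docx"]))
```

This keeps the event loop responsive but does not make the underlying work non-blocking; CPU-bound conversions still compete for the interpreter.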

awaiting op response

The proposed PR addresses issue #34 to some extent. Having not found a suitable Python library, I added a JsonConverter class independent of the PlainTextConverter. In a nutshell: - parse document...

I tried to extract the contents of a PDF, but it is extracted as plain text, not as Markdown. Am I missing a parameter? `from markitdown import MarkItDown` `md = MarkItDown()`...

When I used Markitdown to parse an XLSX file with a size of 80 megabytes (containing over 4.6 million rows of data), the program ran for eight hours and then...
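The report is truncated and does not say where the time went, so the following is only a general sketch of one way to keep memory bounded with very large tables: consume rows from an iterator and write each Markdown line immediately instead of building the whole table in memory. `write_markdown_table` and `generate_rows` are hypothetical helpers, not markitdown functions.

```python
import io
from typing import Iterable, Iterator, Sequence, TextIO, Tuple


def write_markdown_table(rows: Iterable[Sequence[object]], out: TextIO) -> int:
    """Write rows as a Markdown table line by line; return the data-row count."""
    iterator = iter(rows)
    try:
        header = next(iterator)  # the first row is treated as the header
    except StopIteration:
        return 0
    out.write("| " + " | ".join(str(c) for c in header) + " |\n")
    out.write("| " + " | ".join("---" for _ in header) + " |\n")
    count = 0
    for row in iterator:
        # Each row is written as soon as it is read; nothing is accumulated.
        out.write("| " + " | ".join(str(c) for c in row) + " |\n")
        count += 1
    return count


def generate_rows(n: int) -> Iterator[Tuple[object, object]]:
    # Hypothetical row source; a real one would stream rows from the workbook.
    yield ("id", "value")
    for i in range(n):
        yield (i, i * i)


buffer = io.StringIO()
written = write_markdown_table(generate_rows(3), buffer)
table = buffer.getvalue()
```

In practice `out` would be a file opened for writing, and the row source would be a reader that can stream a workbook rather than load it whole.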

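![Image](https://github.com/user-attachments/assets/6c194d40-2ada-40a2-8939-c1004e2630eb) When batch-reading xlsx files, a corrupted file aborts the program outright. The cause is that no exception is raised here, so my outer code cannot catch it.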
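A minimal sketch of the batch loop the reporter seems to want, assuming the converter raises an ordinary exception on a corrupted file (which is what the report asks for). `convert_batch` and `fake_convert` are hypothetical names used for illustration; `fake_convert` stands in for the real conversion call.

```python
from typing import Callable, Dict, Iterable, Tuple


def convert_batch(
    paths: Iterable[str], convert: Callable[[str], str]
) -> Tuple[Dict[str, str], Dict[str, Exception]]:
    """Convert each path; collect failures instead of aborting the batch."""
    results: Dict[str, str] = {}
    failures: Dict[str, Exception] = {}
    for path in paths:
        try:
            results[path] = convert(path)
        except Exception as exc:  # only works if the converter actually raises
            failures[path] = exc
    return results, failures


def fake_convert(path: str) -> str:
    # Hypothetical converter: raises on a "corrupted" file, succeeds otherwise.
    if "corrupt" in path:
        raise ValueError(f"cannot read {path}")
    return f"| data from {path} |"


ok, failed = convert_batch(
    ["good.xlsx", "corrupt.xlsx", "other.xlsx"], fake_convert
)
```

Note that `except Exception` does not catch `SystemExit`, so this pattern cannot help if the library exits the process instead of raising.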

Hi there, it does not work with the Persian language. Results in: [markdown_test.md](https://github.com/user-attachments/files/18279834/%2B.%2B.%2B.%2B.%2B.%2B.%2B.%2B.md)

> > [@gagb](https://github.com/gagb) Would be great to have this as an example in the README! Thanks. > > Agreed. IMO, a PDF based example would be best where it is...

enhancement
open for contribution