[REQUEST] markitdown for file parsing

Open robomotic opened this issue 11 months ago • 0 comments

Reference Issues

No response

Summary

MarkItDown converts many file types into Markdown: https://github.com/microsoft/markitdown. Maybe it could act as a pre-processor, so that docling takes the Markdown as input instead of the original file? It should be a user option.

Basic Example

from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("test.xlsx")
print(result.text_content)
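
To make the "user option" part concrete, here is a rough sketch of how an opt-in pre-processing step might look. The names `preprocess`, `use_markitdown`, and `convert` are hypothetical and not part of kotaemon or docling; the only real library call is the `MarkItDown().convert(...).text_content` usage from the example above. It assumes the downstream parser can read a Markdown file from disk.

```python
from pathlib import Path
from typing import Callable, Optional


def _markitdown_convert(path: str) -> str:
    """Convert a file to Markdown text with MarkItDown (imported lazily)."""
    from markitdown import MarkItDown  # only required when the option is on

    return MarkItDown().convert(path).text_content


def preprocess(
    path: str,
    use_markitdown: bool = False,
    convert: Optional[Callable[[str], str]] = None,
) -> str:
    """Return the path the downstream parser should read (hypothetical helper).

    With the option off, or for files that are already Markdown, the original
    path is returned unchanged. Otherwise the file is converted and the
    Markdown is written next to it with a .md suffix.
    """
    if not use_markitdown or Path(path).suffix.lower() == ".md":
        return path

    convert = convert or _markitdown_convert
    markdown = convert(path)

    out_path = Path(path).with_suffix(".md")
    out_path.write_text(markdown, encoding="utf-8")
    return str(out_path)
```

With the flag left at its default, nothing changes for existing users; `preprocess("test.xlsx", use_markitdown=True)` would write `test.md` and hand that path to the parser. Passing `convert` explicitly is only there so the step can be swapped or tested without MarkItDown installed.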

Drawbacks

The library is new: it was published only 3 weeks ago.

Additional information

No response

robomotic · Jan 12 '25 09:01