dwata
Import Markdown files
As a user I would like to import Markdown files, for example from my company documentation. A folder containing Markdown files should become a data source.
I will look at three Rust crates for this and select one that has the friendliest API:
- https://github.com/wooorm/markdown-rs
- https://github.com/kivikakk/comrak
- https://github.com/pulldown-cmark/pulldown-cmark
The chosen crate must let us traverse the parsed AST.
We need to import the content into a structure that helps in generating embeddings. We also need to track the source file and paragraph behind each embedding, so the matching sections can be sent to AI models.
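As a rough idea of what that structure could look like, here is a minimal sketch using only the Rust standard library. The `Chunk` type, its fields, and `chunk_paragraphs` are hypothetical names for illustration, not existing dwata code. The sketch splits on blank lines instead of walking a real Markdown AST, so a heading counts as its own chunk and fenced code blocks containing blank lines would be split incorrectly. With a real parser, the paragraph nodes of the AST would take the place of this naive split.

```rust
use std::path::{Path, PathBuf};

/// One paragraph of a Markdown file, ready to be embedded.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    /// File the paragraph came from.
    pub source: PathBuf,
    /// Zero-based position of the paragraph within the file.
    pub paragraph_index: usize,
    /// One-based line number on which the paragraph starts.
    pub start_line: usize,
    /// Paragraph text, with its lines joined by '\n'.
    pub text: String,
}

/// Splits `markdown` into chunks at blank lines.
/// This is a naive stand-in for visiting paragraph nodes in an AST.
pub fn chunk_paragraphs(source: &Path, markdown: &str) -> Vec<Chunk> {
    let mut chunks = Vec::new();
    let mut current: Vec<&str> = Vec::new();
    let mut start_line = 0;

    for (i, line) in markdown.lines().enumerate() {
        if line.trim().is_empty() {
            // A blank line closes the paragraph collected so far, if any.
            flush(source, &mut current, start_line, &mut chunks);
        } else {
            if current.is_empty() {
                start_line = i + 1; // remember where this paragraph begins
            }
            current.push(line);
        }
    }
    // The file may end without a trailing blank line.
    flush(source, &mut current, start_line, &mut chunks);
    chunks
}

/// Moves the collected lines into a new `Chunk` and resets the buffer.
fn flush(source: &Path, current: &mut Vec<&str>, start_line: usize, chunks: &mut Vec<Chunk>) {
    if current.is_empty() {
        return;
    }
    let paragraph_index = chunks.len();
    chunks.push(Chunk {
        source: source.to_path_buf(),
        paragraph_index,
        start_line,
        text: current.join("\n"),
    });
    current.clear();
}
```

Calling `chunk_paragraphs(Path::new("docs/intro.md"), &contents)` for each file in the folder would give a flat list of chunks. Storing `source` and `paragraph_index` next to each embedding vector is what would let a search hit be mapped back to the section to send to the model.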
I am using Comrak as the Rust library for this.