marker
Feature request: include page numbers
Hello,
This library has been very effective for my use case: building a RAG pipeline over unstructured docs. However, before we can commit to it, we need to be able to cite the pages from which data was retrieved. Maybe this could be an optional param, with the output using a page number as a markdown header that encapsulates that page's content?
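To illustrate, here is a rough sketch of how we would consume such output on our side. The `## Page N` header format and the `split_by_page` helper are hypothetical, made up for this example; they are not part of the library's current API.

```python
import re

# Assumed (hypothetical) format: each page's content sits under "## Page N".
PAGE_HEADER = re.compile(r"^## Page (\d+)[ \t]*$", re.MULTILINE)


def split_by_page(markdown: str) -> dict[int, str]:
    """Map each page number to the markdown found under that page's header."""
    pages: dict[int, str] = {}
    matches = list(PAGE_HEADER.finditer(markdown))
    for i, match in enumerate(matches):
        start = match.end()
        # A page's content runs until the next page header or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        pages[int(match.group(1))] = markdown[start:end].strip()
    return pages


sample = "## Page 1\nIntro text.\n\n## Page 2\nMore text.\n"
pages = split_by_page(sample)
```

Each retrieved chunk could then carry its page number as citation metadata.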
Thanks!
You might take a look at Llmsherpa/nlm-ingestor for page numbering: https://github.com/nlmatics/nlm-ingestor
Fixed in the latest version.