olmocr
fix jsonl_to_markdown
Fixes #162
Changes proposed in this pull request:
- Add `jsonl_to_markdown.py` utility script to convert JSONL files to Markdown format
- Script extracts the 'text' field from each line in a JSONL file and saves it as a separate Markdown file
- Include proper error handling for JSON decoding and file operations
- Add example usage and documentation
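The script itself is not reproduced in this description, so the following is only a minimal sketch of the behavior listed above, not the code from the PR. The function name `jsonl_to_markdown`, its signature, and the `<stem>_<line number>.md` output naming are assumptions made for illustration.

```python
import json
import sys
from pathlib import Path


def jsonl_to_markdown(jsonl_path, output_dir):
    """Write the 'text' field of each JSONL line to its own Markdown file.

    Hypothetical helper: returns the list of paths written. Blank lines,
    lines that fail to parse, and records without a string 'text' field
    are skipped with a warning on stderr.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(jsonl_path).stem
    written = []

    with open(jsonl_path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines

            # JSON decoding errors are reported per line, not fatal.
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                print(f"line {line_no}: invalid JSON ({exc})", file=sys.stderr)
                continue

            text = record.get("text") if isinstance(record, dict) else None
            if not isinstance(text, str):
                print(f"line {line_no}: no 'text' field", file=sys.stderr)
                continue

            # Assumed naming scheme: one file per source line number.
            out_path = output_dir / f"{stem}_{line_no}.md"
            try:
                out_path.write_text(text, encoding="utf-8")
            except OSError as exc:
                print(f"line {line_no}: cannot write {out_path} ({exc})", file=sys.stderr)
                continue
            written.append(out_path)

    return written
```

Under these assumptions, calling `jsonl_to_markdown("results.jsonl", "markdown_out")` (both paths are placeholders) would produce one `.md` file per valid record and leave malformed lines out of the output rather than aborting the whole run.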
Before submitting
- [x] I've read and followed all steps in the Making a pull request section of the `CONTRIBUTING` docs.
- [x] I've updated or added relevant docstrings following the syntax described in the Writing docstrings section of the `CONTRIBUTING` docs.
- [x] If this PR fixes a bug, I've added a test that will fail without my fix.
- [x] If this PR adds a new feature, I've added tests that sufficiently cover my new functionality.