
fix jsonl_to_markdown

Open ved1beta opened this issue 8 months ago • 0 comments

Fixes #162

Changes proposed in this pull request:

  • Add a jsonl_to_markdown.py utility script that converts JSONL files to Markdown
  • The script extracts the 'text' field from each line of a JSONL file and saves each one as a separate Markdown file
  • Handle JSON decoding errors and file operation errors
  • Add example usage and documentation
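
For readers without the diff at hand, the behavior described above could look roughly like the following. This is a minimal sketch, not the script from this PR: the function signature, the `<stem>_<line number>.md` naming scheme, and the choice to skip bad lines rather than abort are all assumptions made for illustration.

```python
import json
from pathlib import Path


def jsonl_to_markdown(jsonl_path, output_dir):
    """Write the 'text' field of each JSONL line to its own Markdown file.

    Returns the list of files written. The output naming scheme
    (<input stem>_<line number>.md) is an assumption of this sketch.
    """
    src = Path(jsonl_path)
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    with src.open("r", encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines

            # JSON decoding errors: report and move on to the next line.
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                print(f"line {line_no}: skipping invalid JSON ({exc})")
                continue

            text = record.get("text") if isinstance(record, dict) else None
            if not isinstance(text, str):
                print(f"line {line_no}: no string 'text' field, skipping")
                continue

            # File operation errors: report and keep processing.
            target = out / f"{src.stem}_{line_no}.md"
            try:
                target.write_text(text, encoding="utf-8")
            except OSError as exc:
                print(f"line {line_no}: could not write {target} ({exc})")
                continue
            written.append(target)

    return written
```

Called as `jsonl_to_markdown("results.jsonl", "markdown_out")` (both paths hypothetical), it would produce one `.md` file per valid line and print a message for each line it skips.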

Before submitting

  • [x] I've read and followed all steps in the Making a pull request section of the CONTRIBUTING docs.
  • [x] I've updated or added relevant docstrings following the syntax described in the Writing docstrings section of the CONTRIBUTING docs.
  • [x] If this PR fixes a bug, I've added a test that will fail without my fix.
  • [x] If this PR adds a new feature, I've added tests that sufficiently cover my new functionality.


ved1beta, Apr 07 '25 20:04