olmocr

Toolkit for linearizing PDFs for LLM datasets/training

Results 61 olmocr issues

Bumps [actions/setup-python](https://github.com/actions/setup-python) from 4 to 5. Release notes Sourced from actions/setup-python's releases. v5.0.0 What's Changed In scope of this release, we update node version runtime from node16 to node20 (actions/setup-python#772)....

dependencies
github_actions

Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4. Release notes Sourced from actions/checkout's releases. v4.0.0 What's Changed Update default runtime to node20 by @​takost in actions/checkout#1436 Support fetching without the --progress option...

dependencies
github_actions

Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 3 to 4. Release notes Sourced from actions/upload-artifact's releases. v4.0.0 What's Changed The release of upload-artifact@v4 and download-artifact@v4 are major changes to the backend architecture of Artifacts....

dependencies
github_actions

Bumps [actions/cache](https://github.com/actions/cache) from 3 to 4. Release notes Sourced from actions/cache's releases. v4.0.0 What's Changed Update action to node20 by @​takost in actions/cache#1284 feat: save-always flag by @​to-s in actions/cache#1242...

dependencies
github_actions

Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 3 to 4. Release notes Sourced from actions/download-artifact's releases. v4.0.0 What's Changed The release of upload-artifact@v4 and download-artifact@v4 are major changes to the backend architecture of Artifacts....

dependencies
github_actions

### 🐛 Describe the bug I have been following all the instructions on an Ubuntu server and cannot import the package from Jupyter in VS Code. --------------------------------------------------------------------------- ModuleNotFoundError Traceback (most recent...

bug

### 🚀 The feature, motivation and pitch I'm assuming this isn't supported out of the box? I tried this PDF with `allenai/olmOCR-7B-0225-preview` and did not get good results. `{"id": "033dae2f4c12b9b07d00a72702f03ac0639292e4",...

I see a figure at https://olmocr.allenai.org/blog. ![Image](https://github.com/user-attachments/assets/6ade559d-2545-42db-86d5-a2768d2e4b08) I want to add this to my repo as a markdown table. Could you please share your results?

### 🚀 The feature, motivation and pitch How to use olmocr to provide an HTTP service that allows users to upload PDFs or images, parse the files, output in Markdown...

### 🚀 The feature, motivation and pitch I've done something pretty rough using Python and OpenAI Vision, because I had to parse some technical manuals into text, because I have...