olmocr

Toolkit for linearizing PDFs for LLM datasets/training

Results 61 olmocr issues

### 🐛 Describe the bug I am using: nvidia RTX A6000 48GB Followed the instructions carefully, all seemed to install and be fine. `CUDA_DEVICE_ORDER=PCI_BUS_ID python -m olmocr.pipeline ./localworkspace --pdfs /media/pop/samsung256/x64_gsqld_report_files/e408bd57-03eb-4d08-b92c-ab7bf632cfca/cr_100468_7.pdf...

bug

### 🚀 The feature, motivation and pitch Is there any plan that would work on Apple's M series? ### Alternatives _No response_ ### Additional context _No response_

I've collected some notable pipelines at https://github.com/dantetemplar/pdf-extraction-agenda ![Image](https://github.com/user-attachments/assets/2df6db93-3b98-4de8-b023-2a481f81fa7d)

Hello! Thank you for this contribution! I am very excited to try this model with OpenVINO; I built an advanced system based on qwen2-vl this fall for dense table analysis...

### 🚀 The feature, motivation and pitch Do you have any plans to implement the hOCR output format for OCR positioning data? This would allow OCR results to be embedded...

Changes proposed in this pull request: - Add Replicate Demo ## Before submitting - [x] I've read and followed all steps in the [Making a pull request](https://github.com/allenai/olmocr/blob/main/.github/CONTRIBUTING.md#making-a-pull-request) section of the...

Complete local deployment tutorial https://youtu.be/XF3Q_ZjwfaI

### 📚 The doc issue The usage text lists parameters for max rendering and more. Could you provide their default values when running straight from the provided example? ### Suggest...

documentation

### 🚀 The feature, motivation and pitch I tried olmocr and found that it is not able to detect and extract table structure. ### Alternatives _No response_ ### Additional context _No response_

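### 🐛 Describe the bug Description: Gateway error. The log shows multiple HTTP GET requests all returning 502, which prevents the whole processing pipeline from continuing. After investigation, the problem may be related to the compatibility of the sglang model inference service on macOS M2 (no CUDA, relies on MPS); the project's GPU check and service initialization appear to still assume that an NVIDIA GPU is present. Steps to reproduce 1. On macOS M2 (Apple Silicon), follow the README guide to install...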

bug