Jake Poznanski

122 comments by Jake Poznanski

Yeah, that's why I don't exactly want to merge this. I am not sure what else is different between this code and theirs which is causing lower scores.

Open to contributions btw :D

Yeah, sadly some bitrot has occurred... In the latest release from yesterday, `python -m olmocr.bench.convert olmocr_pipeline --dir ./olmOCR-bench/bench_data` should run for the olmOCR case again.
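For anyone scripting this step, here is a minimal sketch that wraps the command quoted above in Python. The helper names `build_convert_command` and `run_convert` are hypothetical and not part of olmocr. It assumes olmocr is installed in the same interpreter and that the benchmark data sits at the path given in the comment.

```python
import subprocess
import sys


def build_convert_command(bench_dir="./olmOCR-bench/bench_data"):
    """Return the argv list for the convert command quoted in the comment.

    Hypothetical helper: only the module name, the `olmocr_pipeline`
    argument, and the `--dir` flag come from the original comment.
    """
    return [
        sys.executable,
        "-m",
        "olmocr.bench.convert",
        "olmocr_pipeline",
        "--dir",
        bench_dir,
    ]


def run_convert(bench_dir="./olmOCR-bench/bench_data"):
    """Run the command and return its exit code (assumes olmocr is installed)."""
    return subprocess.run(build_convert_command(bench_dir), check=False).returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_convert_command()))
```

Calling `run_convert()` would actually launch the conversion; the `__main__` block only prints the command so it can be inspected first.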

Hi, glad you were able to get it to work! Yeah, we are considering our options here. There is something that DeepInfra has published: https://github.com/deepinfra/ocr-tools https://deepinfra.com/allenai/olmOCR-7B-0725-FP8 But I think you'd...

Newer versions have supported local files for a while now.

Yes, you can try out the benchmark tools, but the full benchmark dataset is not available yet. We are still preparing it and hope to be done by the end of the month.

Information on calling the raw model is in the model card: https://huggingface.co/allenai/olmOCR-7B-0225-preview But I do recommend running it with the pipeline.py method described in the docs.

We use the following code in the web demo, sorry it's not in the repo yet, we should add it in v2.

```python
import os
import tempfile

async def convert_image_to_pdf(png_file:...
```

@SkaarFacee If you have an idea, then please let me know. Interesting, it's definitely not normal. What sort of host are you running this on, is it within docker? What...

Hmm, interesting. When we run on multiple GPUs, we run it as, for example, 8 separate Docker containers on one host. Does the error go away if you run just 1...
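To illustrate the one-container-per-GPU layout mentioned above, here is a rough sketch that builds one `docker run` command per GPU, pinning each container to a single device with Docker's `--gpus device=N` option. The image name `example/olmocr-image:latest`, the container names, and the helper `per_gpu_docker_commands` are all placeholders chosen for illustration; the actual image and arguments used in that setup are not stated in the comment.

```python
import shlex


def per_gpu_docker_commands(num_gpus, image="example/olmocr-image:latest", extra_args=()):
    """Build one `docker run` argv list per GPU.

    Assumptions: the image name is a placeholder, and `extra_args` stands in
    for whatever command the container should execute.
    """
    commands = []
    for gpu in range(num_gpus):
        commands.append(
            [
                "docker", "run", "--rm",
                "--gpus", f"device={gpu}",    # expose only this GPU to the container
                "--name", f"olmocr-gpu{gpu}",  # hypothetical container name
                image,
                *extra_args,
            ]
        )
    return commands


if __name__ == "__main__":
    # Print the commands for an 8-GPU host rather than launching anything.
    for cmd in per_gpu_docker_commands(8):
        print(shlex.join(cmd))
```

Running with `num_gpus=1` gives the single-container case, which is a simple way to check whether an error is specific to the multi-GPU setup.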