pyocr

A Python wrapper for Tesseract and Cuneiform -- Moved to GNOME's GitLab

Results 18 pyocr issues

For some reason there are differences between the references and the actual results. The actual results seem good, so it's probably a bug in `update_test_data.sh`.

bug
to study

Is it possible to get a confidence score for the predictions (not the orientation)?
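One possible workaround, sketched here under assumptions: Tesseract's hOCR output normally carries a per-word confidence as `x_wconf` in the `title` attribute of each `ocrx_word` span, so the scores can be read from the raw hOCR. The `WordConfidenceParser` class, the `word_confidences` helper, and the sample markup below are hypothetical illustrations, not part of pyocr.

```python
import re
from html.parser import HTMLParser


class WordConfidenceParser(HTMLParser):
    """Collect (word, confidence) pairs from hOCR `ocrx_word` spans."""

    def __init__(self):
        super().__init__()
        self.words = []    # list of (text, confidence) tuples
        self._depth = 0    # span nesting depth inside the current word
        self._conf = None  # confidence of the word being read
        self._text = []    # text fragments of the word being read

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth:
            # A span nested inside a word: track it so we close correctly.
            self._depth += 1
            return
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            match = re.search(r"x_wconf\s+(\d+(?:\.\d+)?)",
                              attrs.get("title") or "")
            self._conf = float(match.group(1)) if match else None
            self._text = []
            self._depth = 1

    def handle_data(self, data):
        if self._depth:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.words.append(("".join(self._text).strip(), self._conf))


def word_confidences(hocr):
    """Return [(word, confidence), ...] for an hOCR document string."""
    parser = WordConfidenceParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hand-written sample in the shape Tesseract's hOCR usually has (assumption).
SAMPLE_HOCR = """
<span class='ocr_line' title='bbox 10 10 200 40'>
  <span class='ocrx_word' title='bbox 10 10 80 40; x_wconf 93'>Hello</span>
  <span class='ocrx_word' title='bbox 90 10 200 40; x_wconf 71'><strong>world</strong></span>
</span>
"""

for word, conf in word_confidences(SAMPLE_HOCR):
    print(word, conf)
```

Words without an `x_wconf` entry come back with a confidence of `None`, so callers can tell "no score" apart from a low score.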

feature request

I looked at the code at https://github.com/openpaperwork/pyocr/blob/master/src/pyocr/libtesseract/tesseract_raw.py#L381 and https://github.com/openpaperwork/pyocr/blob/master/src/pyocr/libtesseract/__init__.py#L110. It shows that libtesseract supports numeric mode, so why doesn't `get_available_builders` allow us to use `DigitBuilder`?

bug

Hi team, I am currently using pyocr with Tesseract 3.05.01, and I call `pyocr.get_available_tools()` to get the Tesseract tool. Is there any way to set `preserve_interword_spaces` for Tesseract through pyocr?

support

`--load_system_dawg 0` would be helpful as an argument to `image_to_text`, perhaps passed as an options dictionary. Feel free to call it something that makes it language-agnostic.
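A rough sketch of what such an options dictionary could look like, assuming the options are simply forwarded to the Tesseract command line as `-c NAME=VALUE` pairs. The `build_tesseract_args` helper is hypothetical and is not an existing pyocr function; whether a given variable takes effect when set this way depends on the Tesseract version.

```python
def build_tesseract_args(image_path, output_base, lang="eng", options=None):
    """Build a Tesseract command line from a dict of config variables.

    Hypothetical helper for illustration only: each entry of `options`
    becomes one `-c NAME=VALUE` pair on the command line.
    """
    args = ["tesseract", image_path, output_base, "-l", lang]
    # Sort the names so the resulting command line is deterministic.
    for name, value in sorted((options or {}).items()):
        if isinstance(value, bool):
            value = int(value)  # Tesseract accepts 0/1 for boolean variables
        args += ["-c", f"{name}={value}"]
    return args


command = build_tesseract_args(
    "page.png",
    "out",
    options={"load_system_dawg": 0, "preserve_interword_spaces": True},
)
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run` as-is. Keeping the dictionary keys identical to Tesseract's own variable names avoids maintaining a separate mapping, at the cost of being specific to Tesseract rather than tool-agnostic.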

feature request

Someone has been reporting crashes of Paperwork when running the OCR. They are using Tesseract 3.04.01, so there may be something wrong with the libtesseract binding. (Note: currently, the...

bug
to study

Cuneiform tends to stop reading a page when it reaches a large non-readable area. Because of this, when using Cuneiform, not all of the keywords are actually extracted. A way to work...

feature request
to study

Even when using the LineBoxBuilder, it seems too much data is stripped from the hOCR files.

feature request