gcv2hocr

gcv2hocr converts Google Cloud Vision OCR output to hOCR, which can then be used to make a searchable PDF.
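The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes a single Cloud Vision response dict with a `textAnnotations` list whose first entry is the full-page text and whose remaining entries are individual words. The helper names `bbox_from_vertices` and `words_to_hocr`, the `page_width`/`page_height` parameters, and the `lang` default are hypothetical.

```python
import html


def bbox_from_vertices(vertices):
    """Collapse a four-point bounding polygon into (x0, y0, x1, y1)."""
    # Assumption: a coordinate may be missing from the JSON when it is zero,
    # so fall back to 0 instead of raising KeyError.
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def words_to_hocr(response, page_width, page_height, lang="unknown"):
    """Render the word-level annotations of one response as an hOCR page div."""
    # Skip the first annotation: it holds the text of the whole page.
    words = response.get("textAnnotations", [])[1:]
    lines = [
        "<div class='ocr_page' id='page_1' lang='%s' title='bbox 0 0 %d %d'>"
        % (html.escape(lang), page_width, page_height)
    ]
    for index, word in enumerate(words, start=1):
        x0, y0, x1, y1 = bbox_from_vertices(word["boundingPoly"]["vertices"])
        lines.append(
            "  <span class='ocrx_word' id='word_1_%d' title='bbox %d %d %d %d'>%s</span>"
            % (index, x0, y0, x1, y1, html.escape(word["description"]))
        )
    lines.append("</div>")
    return "\n".join(lines)
```

A real converter would also group words into lines and paragraphs and wrap the result in a full HTML document; the sketch only shows how a bounding polygon becomes an hOCR `bbox` property.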

Results 17 gcv2hocr issues

python gcv2hocr.py Capture.jpg.json > capture.hocr
Traceback (most recent call last):
  File "gcv2hocr.py", line 146, in <module>
    page = fromResponse(resp, **args.__dict__)
  File "gcv2hocr.py", line 99, in fromResponse
    word.htmlid="word_%d_%d" % (len(page.content) - 1,...

With gcv2hocr.py, the invisible text for vertical Japanese text is positioned at the bottom of the PDF document. gcv2hocr (C version) produces better output than gcv2hocr.py, but it uses CR...

Hello, thanks for your code! I have an issue with this file when I try to convert it to hOCR. The JPEG-to-JSON conversion was done with gcvocr.sh. Find attached...

If you need to add support for your language, please check the following page: https://github.com/filak/hOCR-to-ALTO/blob/master/codes_lookup.xml You will see a3h="***" in the code; this is the language code written in hOCR.

gcv2hocr does not support Japanese vertical text. This will be supported once Google Cloud Vision OCR supports it.

I found that Microsoft has opened its computer vision service: https://www.microsoft.com/cognitive-services/en-us/computer-vision-api It has OCR JSON output, but the format is different from Google's. Do we need to make an mscv2hocr?

Konstantin Baierer made a Python port. Continue the discussion here: https://github.com/dinosauria123/gcv2hocr/pull/3