dinosauria123
Or this one? https://hub.docker.com/r/ubma/ocr-fileformat/
Thank you for your report. I will check the JSON output, but patches may be delayed because I am busy with my job right now.
Do you want to convert images to hOCR? You can use Tesseract OCR. https://github.com/tesseract-ocr/tesseract
I have checked gcv2hocr, but the output seems to be fine. Did you use gcvocr.sh to get the JSON output? Please attach your JSON output to your comment.
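Before attaching the JSON, it may help to check that the file is complete. The sketch below assumes the file follows the Google Cloud Vision `images:annotate` response layout (a top-level `responses` array whose entries carry `textAnnotations`). Whether gcv2hocr reads exactly these fields is an assumption, and the function name `summarize_gcv_json` is made up for illustration.

```python
import json


def summarize_gcv_json(text):
    """Return (ok, message) for a Cloud Vision style JSON response string."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"not valid JSON: {exc}"

    # Assumed layout: {"responses": [{"textAnnotations": [...]}]}
    responses = data.get("responses") if isinstance(data, dict) else None
    if not responses:
        return False, "no 'responses' array found"

    first = responses[0]
    if "error" in first:
        # The API reports per-image failures inside the response entry.
        return False, f"API error: {first['error'].get('message', 'unknown')}"

    annotations = first.get("textAnnotations", [])
    if not annotations:
        return False, "no 'textAnnotations' in the first response"
    return True, f"{len(annotations)} text annotations"


if __name__ == "__main__":
    sample = '{"responses": [{"textAnnotations": [{"description": "Hello"}]}]}'
    print(summarize_gcv_json(sample))
```

If the check fails, the problem is probably in the request step rather than in the conversion to hOCR.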
Check here. https://github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage#hocr-output
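As a rough sketch of the command-line usage linked above: Tesseract writes hOCR when the `hocr` config name is appended after the output base. The helper name `build_hocr_command` and the file names are made up for illustration. The helper only builds the argument list, and the commented-out line shows how it might be run if `tesseract` is installed.

```python
import subprocess  # only needed if you actually run the command


def build_hocr_command(image_path, output_base, lang=None):
    """Build the argument list for: tesseract <image> <outputbase> [-l lang] hocr"""
    cmd = ["tesseract", image_path, output_base]
    if lang:
        cmd += ["-l", lang]  # optional language code, e.g. "eng"
    cmd.append("hocr")  # config name that switches the output format to hOCR
    return cmd


if __name__ == "__main__":
    cmd = build_hocr_command("page.png", "page", lang="eng")
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)  # would write the hOCR file next to "page"
```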
I think you have to use multiple tools. For example, hOCR to PDF is possible with hocr-tools. https://github.com/tmbdev/hocr-tools#hocr-pdf For PDF, there are probably many tools to convert it to other formats.
Do you know ALTO? https://en.wikipedia.org/wiki/ALTO_(XML) If you want to work with OCR formats, ALTO is better than hOCR. https://github.com/altoxml/documentation/wiki/Software
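To give an idea of what working with ALTO looks like, here is a minimal sketch that pulls the text of each line out of an ALTO document. In ALTO, words are `String` elements inside `TextLine` elements, and the word text is in the `CONTENT` attribute. The sample document and the function names are invented for illustration, and the code ignores the namespace so that it is not tied to one ALTO version.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written sample, not output from a real OCR engine.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Hello" HPOS="10" VPOS="20" WIDTH="50" HEIGHT="12"/>
      <SP/>
      <String CONTENT="world" HPOS="65" VPOS="20" WIDTH="48" HEIGHT="12"/>
    </TextLine>
    <TextLine>
      <String CONTENT="again" HPOS="10" VPOS="40" WIDTH="45" HEIGHT="12"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text):
    """Return one string per TextLine, joining the CONTENT of its String children."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) == "TextLine":
            words = [
                child.get("CONTENT", "")
                for child in elem
                if local_name(child.tag) == "String"
            ]
            lines.append(" ".join(words))
    return lines


if __name__ == "__main__":
    for line in alto_lines(SAMPLE_ALTO):
        print(line)
```

The `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes carry the word coordinates, so the same loop could collect bounding boxes as well.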
I have never used this, but I think it is what you want: https://github.com/tabulapdf/tabula-extractor http://tabula.technology/ I think this topic is not related to gcv2hocr. May I close this issue?
Sorry, the correct debug message is below:
rst:0x8 (TG1WDT_SYS_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0030,len:1184
load:0x40078000,len:13260
load:0x40080400,len:3028
entry 0x400805e4
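When comparing several boot logs like this one, the first line is usually the interesting part: it names the reset reason (here `TG1WDT_SYS_RESET`, which as far as I know is a reset from the timer group 1 watchdog) and the boot mode. A small sketch for extracting those fields is below. The function name and the dictionary keys are made up for illustration.

```python
import re

# Matches the first line of the boot message, e.g.
# "rst:0x8 (TG1WDT_SYS_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)"
RST_LINE = re.compile(
    r"rst:(0x[0-9a-fA-F]+) \((\w+)\),boot:(0x[0-9a-fA-F]+) \((\w+)\)"
)


def parse_reset_line(log_text):
    """Return the reset and boot fields as a dict, or None if no such line is found."""
    match = RST_LINE.search(log_text)
    if match is None:
        return None
    rst_code, rst_name, boot_code, boot_name = match.groups()
    return {
        "rst_code": int(rst_code, 16),
        "rst_name": rst_name,
        "boot_code": int(boot_code, 16),
        "boot_name": boot_name,
    }


if __name__ == "__main__":
    log = "rst:0x8 (TG1WDT_SYS_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)\nconfigsip: 0, SPIWP:0xee"
    print(parse_reset_line(log))
```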