Amit Dovev comments

Results: 538 comments of Amit Dovev

Legacy ara language not working with recent versions of tesseract

Arabic was using the 'Cube' OCR engine. The code for that engine was removed in version 4.0 and will not be restored. Other languages used another engine which we now...

Legacy ara language not working with recent versions of tesseract

We can also do something like this in the code:
```
if (lang == 'ara' and oem == 0) {
  print("Error: Oem 0 is not supported for Arabic");
  return EXIT_FAILURE;
...
```
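The guard suggested in this comment could look roughly like the following in C++. This is only a sketch: `IsEngineSupported` and `CheckEngineOrFail` are hypothetical helper names, not functions from the Tesseract code base, and the check assumes the language is passed as a plain string and the engine mode as an integer where 0 means the legacy engine.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>

// Hypothetical constant: OEM 0 selects the legacy engine.
constexpr int kLegacyOem = 0;

// Returns true when the language/engine combination can be used.
// Only the Arabic + legacy engine pair is rejected in this sketch.
bool IsEngineSupported(const std::string& lang, int oem) {
  return !(lang == "ara" && oem == kLegacyOem);
}

// Prints an error and returns EXIT_FAILURE for an unsupported combination,
// otherwise returns EXIT_SUCCESS.
int CheckEngineOrFail(const std::string& lang, int oem) {
  if (!IsEngineSupported(lang, oem)) {
    std::cerr << "Error: OEM 0 is not supported for Arabic\n";
    return EXIT_FAILURE;
  }
  return EXIT_SUCCESS;
}
```

Splitting the predicate from the error reporting keeps the check easy to reuse from both a command-line front end and library code.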

Is there a preferred way how 3rd party projects can check option used for tesseract build

The disable-legacy build option was added by me so a future tesseract version can easily drop the legacy code. See #707 for the context. Currently, there is no way for...

Is there a preferred way how 3rd party projects can check option used for tesseract build

> during build/configure of project?

I think the proper way to do it is to add `-DDISABLED_LEGACY_ENGINE` to the `Cflags` in the `.pc` file.
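If the `.pc` file did carry that define, a third-party project compiling with the flags from `pkg-config --cflags tesseract` could branch on it at compile time. The sketch below assumes the macro reaches the compiler this way; `kLegacyEngineAvailable` and `PreferredOem` are hypothetical names used only for illustration.

```cpp
// DISABLED_LEGACY_ENGINE is the macro named in the comment above. Whether a
// given tesseract.pc actually exports it is an assumption of this sketch.
#ifdef DISABLED_LEGACY_ENGINE
constexpr bool kLegacyEngineAvailable = false;
#else
constexpr bool kLegacyEngineAvailable = true;
#endif

// Hypothetical helper: pick an engine mode number based on the build.
// 0 = legacy engine, 1 = LSTM engine (matching Tesseract's --oem values).
constexpr int PreferredOem(bool want_legacy) {
  return (want_legacy && kLegacyEngineAvailable) ? 0 : 1;
}
```

Because the decision is a `constexpr`, a project built against a legacy-free Tesseract never requests the legacy engine, and no runtime probing is needed.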

Is there a preferred way how 3rd party projects can check option used for tesseract build

> at runtime?

For this one we can add: `bool get_build_option(char* option);`
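One way such a runtime query might be implemented is a small lookup table filled in at compile time. This is a sketch of the proposal only, not an existing Tesseract function: the option name `"disabled_legacy_engine"` and the `BuildOption` table are assumptions, and the parameter is taken as `const char*` so string literals can be passed directly.

```cpp
#include <cstring>

namespace {

// Hypothetical record describing one build-time option.
struct BuildOption {
  const char* name;
  bool enabled;
};

// Hypothetical table; the option name is illustrative only.
constexpr BuildOption kBuildOptions[] = {
#ifdef DISABLED_LEGACY_ENGINE
    {"disabled_legacy_engine", true},
#else
    {"disabled_legacy_engine", false},
#endif
};

}  // namespace

// Returns true if the named option was enabled at build time.
// Null or unknown option names return false.
bool get_build_option(const char* option) {
  if (option == nullptr) return false;
  for (const BuildOption& entry : kBuildOptions) {
    if (std::strcmp(entry.name, option) == 0) return entry.enabled;
  }
  return false;
}
```

A caller would then write something like `if (get_build_option("disabled_legacy_engine")) { /* avoid OEM 0 */ }`. Returning `false` for unknown names keeps the call safe, at the cost of not distinguishing "disabled" from "not a known option".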

Is there a preferred way how 3rd party projects can check option used for tesseract build

Also possible. Did you see this: https://github.com/tesseract-ocr/tesseract/issues/2372#issuecomment-480559819 ?

[Feature Request] Table structure extraction at the API

I assume Tesseract handles tables in one of these two ways: 1) Table columns are held in Tesseract blocks and cells are held as lines within blocks. 2) Table rows...

[Feature Request] Table structure extraction at the API

Tesseract treats any table it can recognize as a block, so neither of these cases applies.

[Feature Request] Table structure extraction at the API

The table detection code is here: https://github.com/tesseract-ocr/tesseract/blob/master/src/textord/tablefind.cpp

[Feature Request] Table structure extraction at the API

Try adjusting the variables defined here: https://github.com/tesseract-ocr/tesseract/blob/509a6f0ce0e636a9ed92553439f1ed6a56b346c5/src/textord/tablefind.cpp#L143