Results 538 comments of Amit Dovev

Arabic was using the 'Cube' OCR engine. The code for that engine was removed in version 4.0 and will not be restored. Other languages used another engine which we now...

We can also do something like this in the code: ``` if (lang == 'ara' and oem == 0) { print("Error: Oem 0 is not supported for Arabic"); return EXIT_FAILURE;...
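A self-contained C++ sketch of such a guard follows. The function name `check_lang_oem` and its parameters are hypothetical and are not part of Tesseract; only the language code, the OEM value, and the error message come from the comment above.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>

// Hypothetical helper (not a Tesseract API): rejects the one
// language/engine combination mentioned in the comment above.
int check_lang_oem(const std::string& lang, int oem) {
  if (lang == "ara" && oem == 0) {
    std::cerr << "Error: Oem 0 is not supported for Arabic\n";
    return EXIT_FAILURE;
  }
  return EXIT_SUCCESS;
}
```

A caller could run this check before initializing the engine and exit early when it returns `EXIT_FAILURE`.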

I added the disable-legacy build option so that a future Tesseract version can easily drop the legacy code. See #707 for the context. Currently, there is no way for...

> during build/configure of project?

I think the proper way to do it is to add `-DDISABLED_LEGACY_ENGINE` to the `Cflags` in the `.pc` file.
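If the macro did reach client code through the flags reported by pkg-config, a dependent project could branch on it at compile time. The sketch below assumes exactly that; `legacy_engine_available` is a hypothetical helper, not something Tesseract provides.

```cpp
// Assumption: DISABLED_LEGACY_ENGINE is defined on the compiler command
// line (for example via the Cflags that pkg-config reports) when the
// installed library was built without the legacy engine.
// legacy_engine_available() is a hypothetical helper, not a Tesseract API.
constexpr bool legacy_engine_available() {
#ifdef DISABLED_LEGACY_ENGINE
  return false;
#else
  return true;
#endif
}
```

Because the result is a `constexpr` value, a project could also use it in a `static_assert` to fail the build early when it requires the legacy engine.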

> at runtime?

For this one we can add: `bool get_build_option(char* option);`
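One way such a function might look is sketched below. This is only an illustration of the proposal, not existing Tesseract code: the option string it recognizes is an assumption, and the parameter is taken as `const char*` here so that string literals can be passed.

```cpp
#include <cstring>

// Sketch of the proposed query function (hypothetical, not in Tesseract).
// Returns true when the named build option was enabled at compile time.
bool get_build_option(const char* option) {
  if (option == nullptr) {
    return false;
  }
#ifdef DISABLED_LEGACY_ENGINE
  // Assumed option name: the same spelling as the preprocessor macro.
  if (std::strcmp(option, "DISABLED_LEGACY_ENGINE") == 0) {
    return true;
  }
#endif
  return false;  // Unknown or disabled options report false.
}
```

A client could then call `get_build_option("DISABLED_LEGACY_ENGINE")` before requesting the legacy engine and fall back to another engine mode when it returns `true`.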

Also possible. Did you see this: https://github.com/tesseract-ocr/tesseract/issues/2372#issuecomment-480559819 ?

I assume Tesseract handles tables in one of these two ways: 1) Table columns are held in Tesseract blocks and cells are held as lines within blocks. 2) Table rows...

Tesseract treats any table it can recognize as a single block, so neither case applies.

The table detection code is here: https://github.com/tesseract-ocr/tesseract/blob/master/src/textord/tablefind.cpp

Play with the variables: https://github.com/tesseract-ocr/tesseract/blob/509a6f0ce0e636a9ed92553439f1ed6a56b346c5/src/textord/tablefind.cpp#L143