Amit Dovev comments

Results: 538 comments of Amit Dovev

Box File disorder, Arabic Language

What you show here is 'by design'. It should not cause any problems in the training process or in character recognition for RTL languages.

Box File disorder, Arabic Language

> I wonder if the bidi integration is working correctly for LSTM, as the accuracy with Arabic is unsatisfactory. Ray, according to your tests, how does Hebrew (another RTL language) perform?...

Box File disorder, Arabic Language

About `--noextract_font_properties`. Ray confirmed it here: https://github.com/tesseract-ocr/tesseract/issues/634#issuecomment-272027231

Box File disorder, Arabic Language

> Are glyph metrics used for LSTM training?

I believe the answer is 'No'. @theraysmith, can you confirm that?

Box File disorder, Arabic Language

`textord_min_linesize` is a hint for the layout analysis step in Tesseract. If the layout analysis step does not 'cut' the lines properly, the next step, text recognition of the lines,...
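
As a rough illustration of passing such a hint, the sketch below builds a Tesseract command line that overrides `textord_min_linesize` with the `-c` config option. The helper names, the file names, the `ara` language code, and the value `2.5` are placeholders chosen for this example and are not taken from the thread. A suitable value depends on the images and would need to be found by experiment.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, min_linesize, lang="ara"):
    """Return the argv list for a Tesseract run that overrides textord_min_linesize.

    Hypothetical helper for illustration; only the `tesseract` CLI and its
    `-l` and `-c name=value` options are assumed to exist.
    """
    return [
        "tesseract",
        str(image_path),
        str(output_base),
        "-l", lang,
        "-c", f"textord_min_linesize={min_linesize}",
    ]


def run_tesseract(image_path, output_base, min_linesize, lang="ara"):
    """Run Tesseract with the overridden parameter and return the command used."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    cmd = build_tesseract_command(image_path, output_base, min_linesize, lang)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd


if __name__ == "__main__":
    # Placeholder inputs: print the command instead of running it.
    print(" ".join(build_tesseract_command("page.png", "out", 2.5)))
```

Comparing the recognized text across a few values on a page where lines are being merged or split is one way to see whether the hint changes the line segmentation at all.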

Box File disorder, Arabic Language

[Tesseract release notes July 11 2015 - V3.04.00](https://github.com/tesseract-ocr/tesseract/wiki/ReleaseNotes#tesseract-release-notes-july-11-2015---v30400)

> Major change to improve layout analysis for heavily diacritic languages: Thai, Vietnamese, Kannada, Telugu etc.

From DAS2016 slide 5 - 'Page Layout...