Amit Dovev comments

Results: 538 comments of Amit Dovev

peculiarities when running text2image on windows

The dotted\_circle changes in #381 caused problems (on Linux at least). See: https://github.com/tesseract-ocr/tesseract/blob/5bb97f966885/training/pango_font_info.cpp#L438

peculiarities when running text2image on windows

The relevant code was rewritten in Tesseract 5.0. @stweil, do you know whether all the issues mentioned by the OP were solved?

PrintVariables produces config file that ReadConfigFile does not properly read

>The simplest solution would be to remove the descriptions from the PrintVariables output

We don't want to do this because that function is used by the command line tool and...

PrintVariables produces config file that ReadConfigFile does not properly read

How about using: `if (fp == stdout)` or: `(fp == stdout) ? ... : ...` in: https://github.com/tesseract-ocr/tesseract/blob/60fd2b4abaa9c5c5c42d32db57576bc95d28a78a/src/ccutil/params.cpp#L164
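The `fp == stdout` idea above can be sketched roughly as follows. This is only an illustration under assumptions: `ParamSketch`, `FormatParamLine` and `PrintParamsSketch` are hypothetical names, not Tesseract's actual classes or functions, and the tab-separated layout is assumed for the example. The point is that descriptions are written only when the output goes to the terminal, so a file dump contains just name and value pairs.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for a parameter entry; not Tesseract's real Param class.
struct ParamSketch {
  std::string name;
  std::string value;
  std::string description;
};

// Build one output line. The description column is appended only on request.
std::string FormatParamLine(const ParamSketch &p, bool with_description) {
  std::string line = p.name + "\t" + p.value;
  if (with_description) {
    line += "\t" + p.description;
  }
  line += "\n";
  return line;
}

// Print all parameters. Descriptions are included only when writing to stdout,
// so output written to a file stays limited to name/value pairs.
void PrintParamsSketch(std::FILE *fp, const std::vector<ParamSketch> &params) {
  assert(fp != nullptr);
  const bool with_description = (fp == stdout);
  for (const auto &p : params) {
    std::fputs(FormatParamLine(p, with_description).c_str(), fp);
  }
}
```

With this shape, the command line tool would keep printing descriptions, while a dump written to a regular file would omit them.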

PrintVariables produces config file that ReadConfigFile does not properly read

>I think the ability to dump the user's current parameters and easily restore them later is useful

Related issue: #3260

PrintVariables produces config file that ReadConfigFile does not properly read

https://github.com/tesseract-ocr/tesseract/issues/3670#issuecomment-985273726 This comment and my comments below it are also related to this issue.

Fine Tuning results in wrong transcription

See #735

Fine Tuning results in wrong transcription

>I used the GitHub project OCR-D Train to generate the .box and .lstmf files required for training

Do they handle bidi text?

Fine Tuning results in wrong transcription

The chars in the box files need to be in visual order from left to right, but the chars in your box files are in logical order from right to...
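To illustrate the difference between logical and visual order, here is a minimal sketch. It assumes a line that is purely right-to-left, with one code point per glyph and no combining marks, digits, or embedded left-to-right runs; real bidi text needs a full bidi algorithm implementation. `LogicalToVisualPureRtl` is a hypothetical helper, not part of Tesseract or any training tool.

```cpp
#include <algorithm>
#include <string>

// Assumption: the input is a purely right-to-left line stored in logical
// order (first code point = rightmost glyph), with no combining marks.
// Under that assumption, visual left-to-right order is the simple reverse.
std::u32string LogicalToVisualPureRtl(const std::u32string &logical) {
  std::u32string visual(logical);
  std::reverse(visual.begin(), visual.end());
  return visual;
}
```

For example, the logical sequence alef, bet, gimel would come out as gimel, bet, alef, which is the order in which the glyphs appear on the page when read from left to right.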

Fine Tuning results in wrong transcription

>The training text in langdata_lstm/ara is only 80 lines or so.

@Shreeshrii, please report this specific issue in: https://github.com/tesseract-ocr/langdata_lstm