Amit Dovev
The dotted\_circle changes in #381 caused problems (on Linux at least). See: https://github.com/tesseract-ocr/tesseract/blob/5bb97f966885/training/pango_font_info.cpp#L438
The relevant code was rewritten in Tesseract 5.0. @stweil, do you know whether all the issues mentioned by the OP have been solved?
> The simplest solution would be to remove the descriptions from the PrintVariables output

We don't want to do this because that function is used by the command line tool and...
How about using `if (fp == stdout)` or `(fp == stdout) ? ... : ...` in https://github.com/tesseract-ocr/tesseract/blob/60fd2b4abaa9c5c5c42d32db57576bc95d28a78a/src/ccutil/params.cpp#L164 ?
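
A minimal sketch of that idea, not the actual Tesseract code: `ParamEntry`, `FormatParam` and `PrintParams` are hypothetical names used only for illustration. It assumes the output is one tab-separated line per parameter, and it includes the description column only when the destination stream is `stdout`.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for a parameter entry; not Tesseract's actual type.
struct ParamEntry {
  std::string name;
  std::string value;
  std::string description;
};

// Formats one entry as a tab-separated line. The description column is
// appended only when with_description is true.
std::string FormatParam(const ParamEntry& p, bool with_description) {
  std::string line = p.name + "\t" + p.value;
  if (with_description) {
    line += "\t" + p.description;
  }
  line += "\n";
  return line;
}

// Writes all entries to fp. Descriptions are included only when the
// destination is stdout (the command-line case); any other stream gets
// plain name/value pairs that are easier to read back later.
void PrintParams(const std::vector<ParamEntry>& params, std::FILE* fp) {
  assert(fp != nullptr);
  const bool with_description = (fp == stdout);
  for (const ParamEntry& p : params) {
    std::fputs(FormatParam(p, with_description).c_str(), fp);
  }
}
```

Note that the pointer comparison only recognizes the `stdout` stream object itself. A file opened separately on the same descriptor (for example `/dev/stdout`) would be treated as a regular file.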
> I think the ability to dump the user's current parameters and easily restore them later is useful

Related issue: #3260
https://github.com/tesseract-ocr/tesseract/issues/3670#issuecomment-985273726 This comment and my comments below it are also related to this issue.
> I used the GitHub project OCR-D Train to generate the .box and .lstmf files required for training

Do they handle bidi text?
The chars in the box files need to be in visual order from left to right, but the chars in your box files are in logical order from right to...
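
To illustrate the difference between the two orders, here is a minimal sketch under a strong assumption: the line contains only strongly right-to-left base characters (no digits, Latin text, or combining marks). In that restricted case the left-to-right visual order is simply the reverse of the logical order. `LogicalToVisualPureRtl` is a hypothetical helper, not part of Tesseract or its training tools. Real text needs a full implementation of the Unicode Bidirectional Algorithm, such as the ones in ICU or FriBidi.

```cpp
#include <algorithm>
#include <string>

// Returns the left-to-right visual order of a line given in logical order.
// Assumption: every code point is a strongly right-to-left base character,
// so the visual order is the reverse of the logical order.
std::u32string LogicalToVisualPureRtl(const std::u32string& logical) {
  std::u32string visual(logical);
  std::reverse(visual.begin(), visual.end());
  return visual;
}
```

For example, the logical sequence alef, bet, gimel (U+05D0, U+05D1, U+05D2) becomes gimel, bet, alef when listed from the left edge of the rendered line.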
> The training text in langdata_lstm/ara is only 80 lines or so.

@Shreeshrii, please report this specific issue in: https://github.com/tesseract-ocr/langdata_lstm