Amit Dovev issues

Results: 35 issues of Amit Dovev

Added best traineddatas for 4.00 alpha

https://github.com/tesseract-ocr/tessdata/tree/3a94ddd47be0 @theraysmith, how should we present those 'best' files to our users? https://github.com/tesseract-ocr/tesseract/wiki/Data-Files Do you plan to push more updates to the best directory and/or to the root dir in...

question

Hebrew issues

Here, I'm going to raise some issues related to Tesseract's Hebrew support. Dear participants who are interested in Arabic support, I suggest raising Arabic issues in a separate 'issue',...

German Fraktur

From https://github.com/tesseract-ocr/tesseract/issues/40 @stweil commented >Are there also new data files planned for old German (deu_frak)? I was surprised that the default English model with LSTM could recognize some words. @theraysmith...

Superscripts & subscripts

Copied from 59: @Shreeshrii commented: Just checking whether this new training will also address: 2. Correct handling of superscripts. @theraysmith commented: 2. Correct handling of superscripts Beyond...

[info] OCR Ground Truth Resources

https://github.com/cneud/ocr-gt

Yiddish

From #82 @theraysmith commented >OK I have added desired/forbidden characters for heb and yid I assume that apart from the 3 unique characters that you listed (for each) the list...

Correct handling of TM sign

Copied from 59 [reply to @Shreeshrii]: @theraysmith commented: TM is also difficult, as it is in conflict with the needs of fi/fl, which should not appear in the output.

License

Hi, what's the license of this project?

Doxygen: Add an option to only document the API

This will only parse `include/tesseract`. Maybe make this the default, including here: https://tesseract-ocr.github.io/tessapi/5.x/files.html

documentation

Always do serialization in little endian order

https://github.com/tesseract-ocr/tesseract/issues/518#issuecomment-277514434
> @stweil commented on 5 Feb 2017
>
> There are different approaches possible to get support for big endian machines:
>
> 1. Write training data files in native endian byte...

endianness

enhancement

priority: low
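
To illustrate the idea behind "Always do serialization in little endian order", here is a minimal sketch in Python (Tesseract itself is C++, and this is not its code). The function names `pack_weights_le` and `unpack_weights_le` and the record layout (an int32 count followed by float32 values) are hypothetical, chosen only for illustration. The point is that the byte order is fixed in the file format, so a file written on any host reads back the same way on any other.

```python
import struct


def pack_weights_le(weights):
    """Serialize a list of floats as: int32 count, then float32 values.

    The "<" prefix forces little-endian byte order and standard sizes,
    regardless of the byte order of the machine running this code.
    (Hypothetical layout, for illustration only.)
    """
    header = struct.pack("<i", len(weights))
    body = struct.pack("<%df" % len(weights), *weights)
    return header + body


def unpack_weights_le(data):
    """Inverse of pack_weights_le; always interprets bytes as little endian."""
    (count,) = struct.unpack_from("<i", data, 0)
    return list(struct.unpack_from("<%df" % count, data, 4))


if __name__ == "__main__":
    blob = pack_weights_le([0.5, 2.0])
    print(blob.hex())
    print(unpack_weights_le(blob))
```

With this approach a big-endian host pays a small byte-swapping cost on read and write, while existing little-endian data files stay valid.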