Amit Dovev comments

Results: 538 comments by Amit Dovev

Space corrupts the trained model

@stweil, should leading and trailing spaces be removed from the GT by the Tesseract training tool or by https://github.com/tesseract-ocr/tesstrain?
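To make the question concrete, here is a minimal sketch of stripping leading and trailing whitespace on the ground-truth side before training. It is not taken from tesstrain or from the Tesseract training tools: the function names `strip_ground_truth` and `strip_gt_files`, and the assumption that each ground-truth line lives in its own `*.gt.txt` file, are for illustration only.

```python
from pathlib import Path


def strip_ground_truth(text: str) -> str:
    """Remove leading and trailing whitespace from one ground-truth line."""
    return text.strip()


def strip_gt_files(directory: str, pattern: str = "*.gt.txt") -> int:
    """Rewrite ground-truth files in place; return how many files changed.

    Assumption: one ground-truth line per file, matched by `pattern`.
    """
    changed = 0
    for path in sorted(Path(directory).glob(pattern)):
        original = path.read_text(encoding="utf-8")
        # Keep exactly one trailing newline after the stripped text.
        cleaned = strip_ground_truth(original) + "\n"
        if cleaned != original:
            path.write_text(cleaned, encoding="utf-8")
            changed += 1
    return changed
```

Calling `strip_gt_files("data/my-ground-truth")` (a hypothetical directory) would rewrite only the files whose content actually changes.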

Space corrupts the trained model

https://github.com/tesseract-ocr/tesstrain/search?q=strip

GitHub Action: New M1 runner available to all plans, including open source 🚀

@Steve-Glass,

> ### Target date
>
> Available as of 1/30/2023

You mean 1/30/**2024**.

The hocr output is not displayed as xhtml in Chrome

[hocr.zip](https://github.com/tesseract-ocr/tesseract/files/11131967/hocr.zip)

The hocr output is not displayed as xhtml in Chrome

I agree with the suggestion to change the file extension from `.hocr` to `_hocr.html`. @stweil, @zdenop, your opinion? CC: @kba, @bertsky

Tesseract creates hOCR output without text results

> ..but all Tesseract renderers also run `Recognize` conditionally...

The PDF renderer does not call `Recognize()`.

RFC: Tesseract performance

What method do you use to disable OpenMP? Commenting out `AC_OPENMP`, or something else?

RFC: Tesseract performance

`--disable-shared --disable-static` seems to be equivalent to just `--disable-shared`.

RFC: Tesseract performance

>Regarding precision of the dot product: the addition is the critical part for the accuracy. Did you ever try some of the algorithms which help to improve that part, e....

Orientation detection "asymmetrical"

The hOCR output contains the skew angle of the text lines. You can try to use this information to deskew the image manually and then rerun Tesseract.
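A rough sketch of how the skew could be read from an hOCR file follows. It assumes the skew is taken from the `baseline` property in each `ocr_line` title attribute, with the first number being the baseline slope, and it uses the median slope as a page-level estimate. The function names are hypothetical, and the sign convention (which way to rotate the image) should be checked against a sample page before relying on it.

```python
import math
import re
import statistics
from typing import List

# Matches e.g. "baseline 0.015 -18" inside an hOCR title attribute.
# Assumption: the first number is the slope, the second the offset.
BASELINE_RE = re.compile(r"baseline\s+(-?\d*\.?\d+)\s+(-?\d*\.?\d+)")


def baseline_slopes(hocr: str) -> List[float]:
    """Collect the baseline slope of every line found in the hOCR text."""
    return [float(m.group(1)) for m in BASELINE_RE.finditer(hocr)]


def estimate_skew_degrees(hocr: str) -> float:
    """Estimate the page skew as the angle of the median baseline slope."""
    slopes = baseline_slopes(hocr)
    if not slopes:
        raise ValueError("no baseline entries found in hOCR input")
    return math.degrees(math.atan(statistics.median(slopes)))
```

The resulting angle could then be passed to an image tool of your choice to rotate the page before rerunning Tesseract. The median is used here so that a few outlier lines do not dominate the estimate.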