Results 538 comments of Amit Dovev

@stweil, should leading and trailing spaces be removed from the GT in the Tesseract training tool or by https://github.com/tesseract-ocr/tesstrain ?

https://github.com/tesseract-ocr/tesstrain/search?q=strip

@Steve-Glass,
> ### Target date
>
> Available as of 1/30/2023

You mean 1/30/**2024**

[hocr.zip](https://github.com/tesseract-ocr/tesseract/files/11131967/hocr.zip)

I agree with the suggestion to change the file extension from `.hocr` to `_hocr.html`. @stweil, @zdenop, your opinion? CC: @kba, @bertsky

> ..but all Tesseract renderers also run `Recognize` conditionally...

The PDF renderer does not call `Recognize()`.

What method do you use to disable OpenMP? Commenting out `AC_OPENMP`, or something else?

`--disable-shared --disable-static` seems to be equivalent to just `--disable-shared`.

>Regarding precision of the dot product: the addition is the critical part for the accuracy. Did you ever try some of the algorithms which help to improve that part, e....

The hOCR output contains the skew angle of the text lines. You can try using this information to deskew the image manually and then rerun Tesseract.
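A rough sketch of how the angle could be read back from an hOCR file, assuming the line elements carry the hOCR `baseline <slope> <offset>` property in their `title` attribute. The function name `estimate_skew_degrees` is hypothetical, and the sign convention of the result should be checked against a known image before rotating anything:

```python
import math
import re
import statistics

# Matches the hOCR "baseline <slope> <offset>" property inside a title attribute.
# Assumption: plain decimal numbers, no scientific notation.
BASELINE_RE = re.compile(r"baseline\s+(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)")


def estimate_skew_degrees(hocr_text):
    """Return the median baseline angle in degrees, or None if no baseline is found."""
    slopes = [float(m.group(1)) for m in BASELINE_RE.finditer(hocr_text)]
    if not slopes:
        return None
    # The median is less sensitive than the mean to a few badly fitted lines.
    return math.degrees(math.atan(statistics.median(slopes)))
```

The returned angle can then be passed to whatever image tool you prefer for the rotation before running Tesseract again. Lines rotated by a multiple of 90 degrees are a separate case and are not handled here.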