tesseract
Tesseract Open Source OCR Engine (main repository)
Before you submit an issue, please review [the guidelines for this repository](https://github.com/tesseract-ocr/tesseract/blob/main/CONTRIBUTING.md). Please report an issue only for a BUG, not for asking questions. Note that it will be much...
### Environment
* **Tesseract Version**: 4.1.1
* **Platform**: Windows 64-bit

### Current Behavior:
Music symbol is detected as letter 'O'
![SubtitleEdit_2020-04-30_02-51-35](https://user-images.githubusercontent.com/5639078/80664116-d104dc80-8a8d-11ea-95a5-e72e496c4039.png)

### Expected Behavior:
To recognise the symbol? It used...
I suggest focusing on 5.x for at least 2022. That means we should not break the API (and ABI?). Use C++17, not C++20/C++23.
While trying to plot the error rates for training, I have come across an anomaly. I use the LOG file generated from the messages output during an lstmtraining run, which also out...
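For readers who want to reproduce this kind of plot, here is a minimal sketch of pulling error rates out of such a log. The line shape in the comment is an assumption about what lstmtraining prints, not something quoted in the issue, and `parse_training_log` is a hypothetical helper name. Of the three iteration counters, the sketch arbitrarily keeps the second.

```python
import re

# Assumed log line shape (an assumption, not quoted from the issue), e.g.:
#   At iteration 100/200/200, Mean rms=1.5%, delta=3.2%,
#   char train=10.5%, word train=25.0%, skip ratio=0%, ...
LINE_RE = re.compile(
    r"At iteration (\d+)/(\d+)/(\d+),.*?"
    r"char train=([\d.]+)%,\s*word train=([\d.]+)%"
)


def parse_training_log(lines):
    """Return (iteration, char_error, word_error) tuples for matching lines.

    Hypothetical helper: lines that do not match the assumed shape are skipped.
    """
    rows = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            iteration = int(match.group(2))  # second counter, chosen for illustration
            char_error = float(match.group(4))
            word_error = float(match.group(5))
            rows.append((iteration, char_error, word_error))
    return rows
```

The resulting tuples can be passed to any plotting library. If the real log differs from the assumed shape, only `LINE_RE` needs adjusting.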
`tesseract -v` `tesseract 5.2.0 leptonica-1.82.0 libgif 5.1.9 : libjpeg 8d (libjpeg-turbo 2.1.1) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.2 : libopenjp2 2.4.0 Found AVX512BW Found...
I'm further investigating the issue I spotted earlier in https://github.com/tesseract-ocr/tesseract/pull/3141. In this [picture](https://user-images.githubusercontent.com/3341558/179373903-ef6cc246-f4e5-4633-a762-ded4dd22708f.jpg), above the text 'wis-clear' on the right, there is the text 'print'. This text 'print' disappears completely and...
I have a 300 dpi image that has been converted to grayscale. When I print the result of `pytesseract.image_to_string` with `config="--psm 6"`, it produces the...
### Environment
* **Tesseract Version**: 4.1.3, but affects latest *main* branch as well
* **Platform**: `Linux localhost.localdomain 5.3.18-150300.59.68-default #1 SMP Wed May 4 11:29:09 UTC 2022 (ea30951) x86_64 x86_64 x86_64...
Hi, many thanks for this fantastic work, and to all of you! I am here to report some weird behavior with coordinates when chi_tra_vert_*.traineddata is used. > tesseract 4.1.0 leptonica-1.78.0 libgif...
This is for 6.0. See https://github.com/tesseract-ocr/tesseract/pull/3684#issuecomment-999690035. Use std::format when the STL in use supports it, and fall back to [fmtlib](https://github.com/fmtlib/fmt) otherwise. MSVC has full support for std::format. Clang has almost complete support. Currently,...