tesseract
Tesseract Open Source OCR Engine (main repository)
### Environment * **Tesseract Version**: 4.1.1-rc2 , language data: chi_sim.traineddata * **Commit Number**: * **Platform**: Windows 7 64-bit ### Current Behavior: When there is a table on the picture, it...
When lines are close and letters on different lines touch one another, words are not bounded correctly. ### Environment * **Tesseract Version**: 5.0.0-alpha and 4.1 and 4.0 * **Commit Number**:...
My friends told me they waited more than a day. I've waited >1 hour. With similar images, the process finishes in a few minutes. [8e3b2319a6bc41cab2f5c4507ea7e212.zip](https://github.com/tesseract-ocr/tesseract/files/4213441/8e3b2319a6bc41cab2f5c4507ea7e212.zip) ``` const char* GetText2(const char* input)...
Internally Tesseract detects different kinds of regions, not only text regions. Currently regions for images and horizontal or vertical lines are also written to ALTO, hOCR and text output as...
A batch of more than 40000 `tesseract` runs with images from the net randomly hangs in `curl_easy_perform`. Maybe this is caused by problems of the server which delivers the images....
Tesseract Open Source OCR Engine v4.0.0-beta.4-18-g4370 text2image 4.0.0-beta.4-18-g4370 ### Current Behavior: While I am using text2image to create tif/box files, I happened to find an abnormal box on a blank,...
### Environment * **Tesseract Version**: 5.1.0 * **Platform**: Windows 32-bit, compiled under MSVC 2017 ### Current Behavior: I am writing an application where I can select different pieces of text...
https://github.com/tesseract-ocr/tesseract/issues/648#issuecomment-271987456 >Indic may be troubled by the length of the compressed codes used. @theraysmith Can you explain a little more about this?
I am working on a [WebAssembly build](https://github.com/robertknight/tesseract-wasm) of Tesseract targeted at browsers. To reduce the download size I have compiled Leptonica with all optional image formats disabled, since browsers already...
Hi there, I've got some specific images that output the following on Linux: ``` Tesseract Open Source OCR Engine v3.05.00dev with Leptonica Error in boxClipToRectangle: box outside rectangle Error in...