
Tesseract Open Source OCR Engine (main repository)

Results 218 tesseract issues

As the development of OCR moves towards LSTM, new difficulties arise, such as having created lstm/traineddata for different fonts. **Example situations:** - I have created multiple lstm/traineddata models,...

feature request

Font detection works fine in PSM_SINGLE_WORD mode. In PSM_SINGLE_LINE mode it is not working well. For recognizing more text (column, full page) font detection does not work at all. For...

legacy

Tesseract Version: 5 OS: Linux Ubuntu 20 command: `tesseract imagename --dpi 300 -l spa --psm 6 --oem 1` Tried with different psm values; value 6 offers the best results **original...

layout analysis
paragraphs
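The command quoted in the issue above can be assembled programmatically when comparing several `--psm` values. The following is only a minimal sketch: the helper name `build_tesseract_command` and the `stdout` output base are assumptions for illustration (the quoted command does not show an output base), and the code only builds the argument list without running Tesseract.

```python
import shlex

# Values taken from the command quoted in the issue.
PSM_SINGLE_BLOCK = 6  # page segmentation mode 6: assume a single uniform block of text
OEM_LSTM_ONLY = 1     # OCR engine mode 1: neural-net (LSTM) engine only


def build_tesseract_command(image, output_base="stdout", lang="spa",
                            dpi=300, psm=PSM_SINGLE_BLOCK, oem=OEM_LSTM_ONLY):
    """Return the tesseract argument list (hypothetical helper, not part of Tesseract)."""
    return [
        "tesseract", image, output_base,  # output_base is an assumption; the issue omits it
        "--dpi", str(dpi),
        "-l", lang,
        "--psm", str(psm),
        "--oem", str(oem),
    ]


cmd = build_tesseract_command("imagename")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` on a machine where Tesseract and the `spa` language data are installed; varying the `psm` argument reproduces the comparison described in the issue.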

### Environment tesseract v5.0.0-alpha.20191030 Windows 64bit ### Current Behavior: When in text mode, each line in a paragraph is separated by line-break: ``` Line 1 of foo or bar, another...

feature request
question
paragraphs

### Environment * **Tesseract Version**: 4.0.0 ~~4.0.0-beta.1 from https://packages.debian.org/stretch-backports/tesseract-ocr~~ * **Commit Number**: 51316994ccae0b48692d547030f26c0969308214 ~~c3ed6f036064e54e34f75275f66c70dd924527bf~~ * **Platform**: `Linux my-machine 4.9.0-8-amd64 #1 SMP Debian 4.9.130-2 (2018-10-27) x86_64 GNU/Linux` ### Current Behavior: Tesseract...

layout analysis
paragraphs

### Environment v5.0.0 alpha Windows 10-64 bit ### Current Behavior: I have two very similar documents seen here and here: In the former, Tesseract with psm=1 parses out two...

layout analysis

According to Tesseract 4.0.0 [Release Notes](https://github.com/tesseract-ocr/tesseract/wiki/ReleaseNotes#tesseract-release-notes-oct-29-2018---v400) : > Added a new OCR engine that uses neural network system based on LSTMs, with major accuracy gains. My testing with this new...

accuracy
textlines inversion

I tried testing 4.0 with the Telugu language and observed that many words are dropped in between. I supplied a 300 DPI PNG file. Is this a known issue? If...

layout analysis

Before you submit an issue, please review [the guidelines for this repository](https://github.com/tesseract-ocr/tesseract/blob/master/CONTRIBUTING.md). Please report an issue only for a BUG, not for asking questions. Note that it will be much...

### Environment * **Tesseract Version**: Tesseract v5.0.0-alpha-20210401 * **Commit Number**: 38f0fdc * **Platform**: Linux/Debian ### Current Behavior: Repeats parts of preceding or following line. Looks like some memory constructs are...

bug
layout analysis