
Tesseract Open Source OCR Engine (main repository)

Results 218 tesseract issues

### Environment: Tesseract Latest Master from GitHub, Ubuntu 20.04.2 User References: @bertsky @stweil ### Background The problem named Diplopia (courtesy of @bertsky) consists in there being more than 1 character...

bounding box
diplopia

### Environment * **Tesseract Version**: Various, `4.1.1`, `5.0.0 v20201231` * **Platform**: Linux, 64 bit ### Current Behavior: In some cases, Tesseract fully automatic page segmentation does not pick up page...

accuracy
layout analysis

### Environment * **Tesseract Version**: Latest `master` * **Commit Number**: `23ed59bd7bca777e4e104c4ee540843373aa9869` * **Platform**: `Linux gentoo-x13 5.11.7-gentoo-dist #1 SMP Wed Mar 17 21:03:41 -00 2021 x86_64 AMD Ryzen 7 PRO 4750U...

performance
process hangs
binarization

Solution to issue #3590 (makebox doesn't output horizontal coordinates of textangle 90 content). I traced these lines back to 2010 and no one has touched them since; however, they...

$ ./tesstrain.sh --fonts_dir /home/anupamjain/Documents/workspace/ocr_training/fonts --fontlist 'OCRB' --lang eng --linedata_only --langdata_dir /home/anupamjain/Documents/workspace/ocr_training/langdata_lstm --tessdata_dir /home/anupamjain/Documents/workspace/ocr_training/tesseract/tessdata --save_box_tiff --maxpages 10 --output_dir /home/anupamjain/Documents/workspace/ocr_training/train --exposures "0" === Starting training for language 'eng' [Thursday 28 April 2022...

Tesseract is doing a fantastic job at processing the input image! The original `demo.jpg` size is `3614 KB`; `tesseract demo.jpg out get.images` gives me `demo.processed.tif`, which is only `35 KB`. I'd...

feature request
PDF

https://groups.google.com/d/msgid/tesseract-ocr/1a3e8773-7151-48f9-92bb-fda888293eab%40googlegroups.com?utm_medium=email&utm_source=footer > While the single "S" is recognized correctly, the text "2S" is recognized as "25". Here is link to the test image: https://03054610326450256607.googlegroups.com/attach/b8b86693ac072/2s.png?part=0.4&view=1

accuracy

Signed-off-by: Stefan Weil

enhancement

This ensures that transformations like unicode normalisation are done on the truth output as well as the OCR output, so that you can compare the two properly. Before this a...
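
The idea in the snippet above, applying the same Unicode normalisation to both the ground truth and the OCR output before comparing them, can be illustrated with a minimal sketch. It is written in Python with the standard `unicodedata` module and is not the code from the pull request; the function names `normalise` and `texts_match`, the sample strings, and the choice of NFC as the normalisation form are all assumptions made for illustration.

```python
import unicodedata


def normalise(text: str, form: str = "NFC") -> str:
    """Return text in the given Unicode normalisation form (NFC assumed here)."""
    return unicodedata.normalize(form, text)


def texts_match(truth: str, ocr: str, form: str = "NFC") -> bool:
    """Compare truth and OCR text after normalising both sides the same way."""
    return normalise(truth, form) == normalise(ocr, form)


# Hypothetical sample data: the same word encoded in two different ways.
truth = "cafe\u0301"  # 'e' followed by a combining acute accent
ocr = "caf\u00e9"     # precomposed 'é'

raw_equal = truth == ocr                 # compares code points directly
normalised_equal = texts_match(truth, ocr)  # compares after normalisation
```

Without normalisation the two strings differ code point by code point even though they render identically, so a raw comparison would count a spurious error.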