Tesseract Open Source OCR Engine (main repository)

Results 218 tesseract issues

```
$ training/lstmtraining --model_output ~/tesstutorial/sanskrit2003_from_full/sanskrit2003 \
>   --continue_from ~/tesstutorial/sanskrit2003_from_full/san.lstm \
>   --train_listfile ~/tesstutorial/santrain/san.training_files.txt \
>   --target_error_rate 0.01
Loaded file /home/shree/tesstutorial/sanskrit2003_from_full/sanskrit2003_checkpoint, unpacking...
Successfully restored trainer from /home/shree/tesstutorial/sanskrit2003_from_full/sanskrit2003_checkpoint
Loaded 1746/1746 pages (0-1746)...
```

bug
training
encoding failed

I'm trying to train a new Korean font by fine-tuning with tesseract_lstm, but when I run the fine-tuning step an encoding error appears on almost every line. Error output below: ...

training
encoding failed

### Environment Tesseract Version: v5.1.0.20220510 ### Current Behavior: Extracting tessdata components from chi_sim.traineddata Wrote chi_sim.lstm Version:4.00.00alpha:chi_sim:synth20170629:[1,48,0,1Ct3,3,16Mp3,3Lfys64Lfx96Lrx96Lfx512O1c1] 0:config:size=1966, offset=192 17:lstm:size=12152851, offset=2158 18:lstm-punc-dawg:size=282, offset=12155009 19:lstm-word-dawg:size=590634, offset=12155291 20:lstm-number-dawg:size=82, offset=12745925 21:lstm-unicharset:size=258834, offset=12746007 22:lstm-recoder:size=72494,...

training
encoding failed
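
The component listing in the chi_sim snippet above follows a regular `index:name:size=..., offset=...` pattern, so it can be sanity-checked mechanically. Below is a minimal sketch, assuming that log format holds for every entry; the helper names `parse_components` and `is_contiguous` are hypothetical and not part of Tesseract. Only the six complete entries from the snippet are used, since the `lstm-recoder` entry is cut off.

```python
import re

# Matches entries such as "17:lstm:size=12152851, offset=2158".
# Assumption: every component line in the log follows this shape.
ENTRY = re.compile(r"(\d+):([\w-]+):size=(\d+), offset=(\d+)")


def parse_components(log_text):
    """Return (index, name, size, offset) tuples found in the log text."""
    return [
        (int(index), name, int(size), int(offset))
        for index, name, size, offset in ENTRY.findall(log_text)
    ]


def is_contiguous(components):
    """True if each component starts exactly where the previous one ends."""
    return all(
        prev[3] + prev[2] == cur[3]
        for prev, cur in zip(components, components[1:])
    )


# The six complete entries quoted in the issue snippet.
SAMPLE = (
    "0:config:size=1966, offset=192 "
    "17:lstm:size=12152851, offset=2158 "
    "18:lstm-punc-dawg:size=282, offset=12155009 "
    "19:lstm-word-dawg:size=590634, offset=12155291 "
    "20:lstm-number-dawg:size=82, offset=12745925 "
    "21:lstm-unicharset:size=258834, offset=12746007"
)

components = parse_components(SAMPLE)
for index, name, size, offset in components:
    print(f"{index:>2} {name:<18} size={size:>9} offset={offset:>9}")
print("contiguous:", is_contiguous(components))
```

For the quoted entries each offset equals the previous offset plus the previous size, which suggests the file was unpacked without gaps. It says nothing about why encoding fails during fine-tuning.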

### Environment * **Tesseract Version**: Current main repository (4.00.00alpha) * **Platform**: Windows 7 32-bit ### Current Behavior: It recognizes Arabic characters but cannot recognize Arabic numerals (ارقام عربى 0123456789) I...

traineddata
eastern arabic numerals

This PR improves the positions of symbol bounding boxes when an LSTM model is used. Up to 20 times fewer errors were observed in sample images. This PR...

bounding box

Lines with low line spacing are missed, and I haven't found a way to recognize them. If a line is not separated enough from its neighbors, it won't be detected. Could Tesseract allow recognizing text...

layout analysis

### Environment * **Tesseract Version**: 5.1.0 * **Platform**: Windows 32-bit ### Current Behavior: I run several different OCR processes on a daily basis. Once, I...

I'm trying to retrain this Tesseract model (https://gitlab.com/pninim.org/tessdata_heb_rashi/-/blob/main/tesseract_4.1.1/TRAINING.md) for a specific, obscure Hebrew script so that it works with Tesseract 5. Using the command listed there, I'm trying to get a list of available...

text2image

### Environment * **Tesseract Version**: 4.1.0 * **Commit Number**: 5280bbcade4e2dec5eef439a6e189504c2eadcd9 * **Platform**: Windows 10, 64-bit, Version 21H1 (OS Build 19043.1526) ### Current Behavior: On a certain image, an integer division-by-zero...

Undefined Behaviour

### Environment * **Tesseract Version**: 4.1.1 * **Platform**: macOS Catalina 10.15 ### Current Behavior: Whenever I execute `$ tesseract images/IMG_3958.HEIC output/grocery_bill` I get this error: ``` $ tesseract images/IMG_3958.HEIC output/grocery_bill...

question
leptonica