tesseract
Tesseract Open Source OCR Engine (main repository)
```
$ training/lstmtraining --model_output ~/tesstutorial/sanskrit2003_from_full/sanskrit2003 \
>   --continue_from ~/tesstutorial/sanskrit2003_from_full/san.lstm \
>   --train_listfile ~/tesstutorial/santrain/san.training_files.txt \
>   --target_error_rate 0.01
Loaded file /home/shree/tesstutorial/sanskrit2003_from_full/sanskrit2003_checkpoint, unpacking...
Successfully restored trainer from /home/shree/tesstutorial/sanskrit2003_from_full/sanskrit2003_checkpoint
Loaded 1746/1746 pages (0-1746)...
```
I'm trying to train a new Korean font using tesseract_lstm and fine-tuning, but when I run the fine-tuning step, an encoding error appears on almost every line. The error output is below: ```...
### Environment Tesseract Version: v5.1.0.20220510 ### Current Behavior: Extracting tessdata components from chi_sim.traineddata Wrote chi_sim.lstm Version:4.00.00alpha:chi_sim:synth20170629:[1,48,0,1Ct3,3,16Mp3,3Lfys64Lfx96Lrx96Lfx512O1c1] 0:config:size=1966, offset=192 17:lstm:size=12152851, offset=2158 18:lstm-punc-dawg:size=282, offset=12155009 19:lstm-word-dawg:size=590634, offset=12155291 20:lstm-number-dawg:size=82, offset=12745925 21:lstm-unicharset:size=258834, offset=12746007 22:lstm-recoder:size=72494,...
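The component listing in the output above follows a regular `index:name:size=N, offset=M` pattern, so it can be checked mechanically. The following is a minimal sketch in Python, not part of Tesseract itself; the names `ENTRY` and `parse_components` are hypothetical, and the pattern is inferred only from the entries quoted in this report. It parses the complete entries and prints where each one ends, which should equal the next entry's offset if the components are stored back to back.

```python
import re

# Assumed entry shape, inferred from the quoted output,
# e.g. "17:lstm:size=12152851, offset=2158".
ENTRY = re.compile(r"(\d+):([\w-]+):size=(\d+), offset=(\d+)")


def parse_components(listing):
    """Return one dict per complete component entry found in the text."""
    return [
        {"index": int(i), "name": name, "size": int(size), "offset": int(offset)}
        for i, name, size, offset in ENTRY.findall(listing)
    ]


# Three entries copied from the report above.
sample = (
    "0:config:size=1966, offset=192 "
    "17:lstm:size=12152851, offset=2158 "
    "18:lstm-punc-dawg:size=282, offset=12155009"
)

components = parse_components(sample)
for c in components:
    # offset + size is the position where this component ends.
    print(c["name"], c["offset"], c["offset"] + c["size"])
```

Truncated entries (such as the final `22:lstm-recoder` line here) do not match the pattern and are skipped rather than guessed at.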
### Environment * **Tesseract Version**: Current main repository (4.00.00alpha) * **Platform**: Windows 7 32-bit ### Current Behavior: It recognizes Arabic characters but cannot recognize Arabic numbers (ارقام عربى 0123456789) I...
This PR improves the positions of symbol bounding boxes when an LSTM model is used. Up to 20 times fewer errors have been observed in sample images. This PR...
Lines with low line spacing are missing from the output, and I haven't found a way to recognize them. If a line is not separated enough from its neighbors, it won't be detected. Could Tesseract allow recognizing text...
### Environment * **Tesseract Version**: 5.1.0 * **Platform**: Windows 32-bit ### Current Behavior: I have several processes in which I work with OCR on a daily basis. Once, I...
I'm trying to retrain this Tesseract engine (https://gitlab.com/pninim.org/tessdata_heb_rashi/-/blob/main/tesseract_4.1.1/TRAINING.md) for a specific, obscure Hebrew script for Tesseract 5. Using the command listed there, I'm trying to get a list of available...
### Environment * **Tesseract Version**: 4.1.0 * **Commit Number**: 5280bbcade4e2dec5eef439a6e189504c2eadcd9 * **Platform**: Windows 10, 64-bit, Version 21H1 (OS Build 19043.1526) ### Current Behavior: On a certain image, an integer division-by-zero...
### Environment * **Tesseract Version**: 4.1.1 * **Platform**: macOS Catalina 10.15 ### Current Behavior: Whenever I execute `$ tesseract images/IMG_3958.HEIC output/grocery_bill` I get this error: ``` $ tesseract images/IMG_3958.HEIC output/grocery_bill...