tesstrain
Train Tesseract LSTM with make
Currently, Tesseract is not at all good at maths equations. Can Tesseract be given the correct training data and work for maths equations, the same as Mathpix...
I want to fine-tune ara.traineddata from the tessdata_best repo to handle extended words like this: [image: sample_9]. To do that, I made a list of lines with the...
I have recently started learning and experimenting with Tesseract OCR. I have trained a new font using tesstrain. Now my use case is that I want...
I've created *.exp0.gt.txt as a base for manual ground truth creation using [Shreeshrii's shell script](https://github.com/tesseract-ocr/tesstrain/issues/7#issuecomment-419714852) and the files contain a space before and after the text (no newlines etc). Example:...
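If the leading and trailing spaces described above turn out to be unwanted, they can be removed in bulk. The following is only a minimal sketch: the helper names `strip_ground_truth` and `clean_gt_files` are hypothetical, the `*.gt.txt` pattern is taken from the file name mentioned in the issue, and whether tesstrain actually needs the spaces removed is an assumption, not something the issue confirms.

```python
from pathlib import Path


def strip_ground_truth(text: str) -> str:
    """Return the transcription without leading or trailing whitespace."""
    return text.strip()


def clean_gt_files(directory: str, pattern: str = "*.gt.txt") -> int:
    """Strip surrounding whitespace in every matching file; return how many changed.

    `directory` and `pattern` are illustrative parameters, not tesstrain options.
    """
    changed = 0
    for path in sorted(Path(directory).glob(pattern)):
        original = path.read_text(encoding="utf-8")
        cleaned = strip_ground_truth(original)
        if cleaned != original:
            # Rewrite the file only when stripping actually changed it.
            path.write_text(cleaned, encoding="utf-8")
            changed += 1
    return changed
```

Running `clean_gt_files("path/to/ground-truth")` on a copy of the data first is the safer choice, since the files are rewritten in place.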
Kindly help me fix this issue; I don't know what went wrong. Please guide me. python3 shuffle.py 0 "data/OCRA/all-lstmf" + head -n 269 data/OCRA/all-lstmf + tail -n 30 data/OCRA/all-lstmf combine_lang_model...
Hi everyone, I'm working on the Urdu language to improve the accuracy of Tesseract. I used the code below to get the output; however, the result was extremely bad....
Hi All, I'm having trouble executing the fine-tuning on this repository. Below is my code, which I run in my Jupyter notebook: ``` **Step1:** !git clone https://github.com/tesseract-ocr/tesstrain.git Step-2: %cd tesstrain...
As mentioned by @stefan6419846 in https://github.com/madmaze/pytesseract/issues/508 , there is a python wrapper for training in [tesstrain/src/](https://github.com/tesseract-ocr/tesstrain/tree/main/src) , which unfortunately is not documented in [tesseract](https://github.com/tesseract-ocr/tesseract), [tessdoc](https://tesseract-ocr.github.io/tessdoc/) and [tesstrain](https://github.com/tesseract-ocr/tesstrain/) repositories. From my...
Normalization failed / Invalid start of grapheme sequence Error While training the tesseract model
Normalization failed for string 'ଜୀବନକୁ ନିବିଡ଼ ଭାବେ ଏକନ୍ୱିତ କରିଛନ୍ତି' Invalid start of grapheme sequence:D=0xb71 Normalization failed for string 'ପରମ୍ପରାକୁ ଅବଲମ୍ୱନ କରିଛନ୍ତି, ସେତିକି ମଧ୍ୟ' Invalid start of grapheme sequence:M=0xb48 Normalization failed...
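To see where the code points named in the log (0xb71 and 0xb48) occur in a ground-truth line, a small diagnostic like the one below may help. It is a sketch only: `describe_code_points` is a hypothetical helper, it merely locates characters using Python's standard `unicodedata` module, and it does not reproduce Tesseract's own normalization or grapheme validation.

```python
import unicodedata


def describe_code_points(line: str, suspects: set[int]) -> list[tuple[int, str, str]]:
    """List (index, 'U+XXXX', Unicode name) for each suspect code point in `line`."""
    hits = []
    for index, char in enumerate(line):
        if ord(char) in suspects:
            name = unicodedata.name(char, "UNKNOWN")
            hits.append((index, f"U+{ord(char):04X}", name))
    return hits


# Code points copied from the error messages in the log above.
SUSPECTS = {0x0B71, 0x0B48}

if __name__ == "__main__":
    sample = "ଜୀବନକୁ ନିବିଡ଼ ଭାବେ ଏକନ୍ୱିତ କରିଛନ୍ତି"
    for index, code, name in describe_code_points(sample, SUSPECTS):
        print(index, code, name)
```

Looking at the character immediately before each hit usually shows which sequence the validator rejected.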
[Archive.zip](https://github.com/tesseract-ocr/tesstrain/files/10204085/Archive.zip) Uploaded my training log and .traineddata files along with a sample image. Log seems to indicate that the model is correctly getting the text, but if I try running...