tesstrain

Train Tesseract LSTM with make

Results 63 tesstrain issues

Currently Tesseract is not at all good at math equations. Can Tesseract be trained with the correct training data to work on math equations, the same as Mathpix...

So I want to fine-tune ara.traineddata in the traineddata_best repo to handle extended words like this: ![sample_9](https://github.com/tesseract-ocr/tesstrain/assets/36017867/179090ca-04b4-4402-a484-f2365156a2c1) To do that, I made a list of lines with the...

I have recently started learning and experimenting with Tesseract OCR. I have trained a model for a new font using tesstrain. Now my use case is that I want...

I've created *.exp0.gt.txt as a base for manual ground truth creation using [Shreeshrii's shell script](https://github.com/tesseract-ocr/tesstrain/issues/7#issuecomment-419714852) and the files contain a space before and after the text (no newlines etc). Example:...

Kindly help me fix this issue. I don't know what went wrong, please guide me. python3 shuffle.py 0 "data/OCRA/all-lstmf" + head -n 269 data/OCRA/all-lstmf + tail -n 30 data/OCRA/all-lstmf combine_lang_model...

bug

Hi everyone, I'm working on the Urdu language to enhance the accuracy of Tesseract. I used the code below to get the output; however, the result was extremely bad....

Hi All, I'm having trouble executing the fine-tuning on this repository. Below is my code, which I run in my Jupyter notebook: ``` **Step1:** !git clone https://github.com/tesseract-ocr/tesstrain.git Step-2: %cd tesstrain...

As mentioned by @stefan6419846 in https://github.com/madmaze/pytesseract/issues/508, there is a Python wrapper for training in [tesstrain/src/](https://github.com/tesseract-ocr/tesstrain/tree/main/src), which unfortunately is not documented in the [tesseract](https://github.com/tesseract-ocr/tesseract), [tessdoc](https://tesseract-ocr.github.io/tessdoc/) and [tesstrain](https://github.com/tesseract-ocr/tesstrain/) repositories. From my...

Normalization failed for string 'ଜୀବନକୁ ନିବିଡ଼ ଭାବେ ଏକନ୍ୱିତ କରିଛନ୍ତି' Invalid start of grapheme sequence:D=0xb71 Normalization failed for string 'ପରମ୍ପରାକୁ ଅବଲମ୍ୱନ କରିଛନ୍ତି, ସେତିକି ମଧ୍ୟ' Invalid start of grapheme sequence:M=0xb48 Normalization failed...

[Archive.zip](https://github.com/tesseract-ocr/tesstrain/files/10204085/Archive.zip) Uploaded my training log and .traineddata files along with a sample image. Log seems to indicate that the model is correctly getting the text, but if I try running...

stale