Training a custom OCR

Open Jacky2357 opened this issue 1 year ago • 3 comments

While training, I get training and validation accuracy of about 90%, but when I test the custom model, the accuracy is below 5%. Can you provide the steps for using a custom-trained model? I am following the steps in "https://www.youtube.com/watch?v=-j3TbyceShY&t=207s". Also, how can I take the weights trained for 20,000 iterations and train them for another 20,000? That is, how should I use these or other pre-trained weights for transfer learning? The repository does not make this clear.

Jacky2357 avatar Jun 21 '23 07:06 Jacky2357

To use a pre-trained or previously fine-tuned model, place it in the trainer/saved_models folder and set its path in the saved_model field of the en_filtered_config.yaml file: saved_model: '' #'saved_models/en_filtered/iter_300000.pth'

  1. You can download models from the recognition_models URLs listed in the `easyocr/config.py` file. If you want to train gen1 models, change these parameters as required:
optim: adam / adadelta
lr: # Make it small, e.g. < 0.001
FeatureExtraction: VGG / ResNet
input_channel: 1
output_channel: 512
hidden_size: 512
new_prediction: False
freeze_FeatureFxtraction: False  # To freeze VGG / ResNet weights
freeze_SequenceModeling: False  # To freeze BiLSTM weights
characters: # Use the exact character list found in the recognition_models dictionary

recognition_models dictionary: https://github.com/JaidedAI/EasyOCR/blob/c999505ef6b43be1c4ee36aa04ad979175178352/easyocr/config.py#L53C1-L53C18
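
Since the character list in the config has to match the pre-trained model's list exactly, a quick check before starting a run may save some time. The following is only a sketch: `compare_character_lists` is a hypothetical helper (not part of EasyOCR), and the two short strings are made-up stand-ins for the real lists you would copy from the recognition_models dictionary and from your config file.

```python
def compare_character_lists(pretrained: str, configured: str) -> dict:
    """Report how a configured character list differs from the pre-trained one.

    Hypothetical helper for illustration; not an EasyOCR function.
    """
    pretrained_set, configured_set = set(pretrained), set(configured)
    return {
        # Characters the pre-trained model knows but the config lacks
        "missing": sorted(pretrained_set - configured_set),
        # Characters in the config that the pre-trained model never saw
        "extra": sorted(configured_set - pretrained_set),
        # True only if the strings are identical, including order
        "exact_match": pretrained == configured,
    }


# Made-up short lists; replace with the real strings from
# easyocr/config.py and from your training config.
pretrained_chars = "0123456789abc"
configured_chars = "0123456789abº"

report = compare_character_lists(pretrained_chars, configured_chars)
print(report)
```

If `exact_match` is false, the `missing` and `extra` entries show which characters to fix in the config before loading the checkpoint.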

akshayrakate avatar Sep 11 '23 13:09 akshayrakate

Thank you! Am I able to add new symbols when training from a pre-trained model? The latin model has ª but not º which we see extensively in our documents.

Gistix avatar Sep 30 '23 21:09 Gistix

No. You can't add new symbols to the pre-trained model as it is: loading it will raise an error because the number of outputs in the last layer will not match. I am unsure about this, but you can try setting new_prediction: True and adding the new characters to characters: in the config file. For reference, here is the model initialization code: https://github.com/JaidedAI/EasyOCR/blob/c999505ef6b43be1c4ee36aa04ad979175178352/trainer/train.py#L71C9-L81C101
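
To illustrate the mismatch, here is a rough sketch in plain Python. It assumes the prediction layer is a linear layer with one output per character plus one extra class for the CTC blank token, and that its input size equals hidden_size; `prediction_layer_shape` is a hypothetical helper and the character strings are made up.

```python
def prediction_layer_shape(characters: str, hidden_size: int = 512) -> tuple:
    """Weight shape of a linear prediction layer for a given character list.

    Assumption: CTC decoding reserves one extra class for the blank token,
    so the layer has len(characters) + 1 outputs.
    """
    num_classes = len(characters) + 1
    return (num_classes, hidden_size)


# Made-up lists: the "pre-trained" one has ª, the extended one adds º.
pretrained_chars = "abcª"
extended_chars = pretrained_chars + "º"

old_shape = prediction_layer_shape(pretrained_chars)
new_shape = prediction_layer_shape(extended_chars)

# The checkpoint stores weights with old_shape, but the new model expects
# new_shape, so the weights cannot be loaded into that layer unchanged.
print(old_shape, new_shape, old_shape == new_shape)
```

Under these assumptions, only the last layer changes shape, which is why re-initializing just the prediction layer (what new_prediction: True appears to be for) might work while the rest of the weights are reused.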

akshayrakate avatar Oct 30 '23 12:10 akshayrakate