EasyOCR
Training a custom OCR
While training, I get training and validation accuracy of about 90%, but when I test the custom model, the accuracy is below 5%. Can you provide the steps for using a custom-trained model? I am following the steps in "https://www.youtube.com/watch?v=-j3TbyceShY&t=207s". Also, how can I use the weights saved after 20,000 iterations to train for another 20,000? In other words, how should I use these or other pre-trained weights for transfer learning? The repository does not give a clear idea of this.
To continue from a pre-trained or previously fine-tuned model, place the checkpoint in the trainer/saved_models folder and set its path in the saved_model field of the en_filtered_config.yaml file, for example:
saved_model: 'saved_models/en_filtered/iter_300000.pth'
- You can download models from the recognition_models URLs listed in the `easyocr/config.py` file. If you want to train gen1 models, change the following parameters as required:
optim: adam / adadelta
lr: # Make it small, e.g. < 0.001
FeatureExtraction : VGG / ResNet
input_channel: 1
output_channel: 512
hidden_size: 512
new_prediction: False
freeze_FeatureFxtraction: False # To freeze VGG/ ResNet weights
freeze_SequenceModeling: False # To freeze BiLSTM weights
characters: # Mention exact character_list found in recognition_models dictionary
The recognition_models dictionary is defined here: https://github.com/JaidedAI/EasyOCR/blob/c999505ef6b43be1c4ee36aa04ad979175178352/easyocr/config.py#L53C1-L53C18
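As a rough illustration of the settings above, here is a minimal Python sketch that derives a "resume" configuration from a base one. The key names come from the config lines in this reply; the helper `resume_from`, the checkpoint name `iter_20000.pth`, the placeholder character list, and the concrete values are hypothetical and only stand in for whatever your own run produced. It does not call EasyOCR itself.

```python
from copy import deepcopy

# Base settings, using the key names quoted in this thread.
# The values are illustrative assumptions, not recommended defaults.
base_config = {
    "saved_model": "",            # empty string: train from scratch
    "optim": "adam",
    "lr": 1.0,
    "FeatureExtraction": "VGG",
    "input_channel": 1,
    "output_channel": 512,
    "hidden_size": 512,
    "new_prediction": False,
    "characters": "0123456789",   # placeholder; use the exact list of the model you load
}


def resume_from(config, checkpoint_path, lr=0.0005):
    """Return a copy of config that continues training from checkpoint_path.

    Hypothetical helper: it only edits the dictionary and leaves the
    original untouched, so both versions can be written out to YAML files.
    """
    resumed = deepcopy(config)
    resumed["saved_model"] = checkpoint_path
    resumed["lr"] = lr  # small learning rate for fine-tuning, as advised above
    return resumed


# Hypothetical checkpoint saved after the first 20,000 iterations.
resumed_config = resume_from(base_config, "saved_models/en_filtered/iter_20000.pth")
print(resumed_config["saved_model"], resumed_config["lr"])
```

The resulting values would then be copied into en_filtered_config.yaml before starting the second run; the character list has to stay identical to the one the checkpoint was trained with.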
Thank you! Am I able to add new symbols when training from a pre-trained model? The latin model has ª but not º which we see extensively in our documents.
No. You can't add new symbols to the pre-trained model directly: it will raise an error because the number of outputs in the last layer will no longer match the checkpoint.
I am unsure about this, but you can try setting new_prediction: True and adding the new characters to characters: in the config file.
Attaching code link for reference about model initialization: https://github.com/JaidedAI/EasyOCR/blob/c999505ef6b43be1c4ee36aa04ad979175178352/trainer/train.py#L71C9-L81C101
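To make the size mismatch concrete, here is a small Python sketch of the arithmetic. It assumes a CTC-style prediction layer that has one output per character plus one extra "blank" class; that assumption, the helper `num_output_classes`, and the toy character lists are all illustrative and are not the real latin character list.

```python
def num_output_classes(characters):
    """Number of outputs of the final layer for a given character list.

    Assumption: CTC-style decoding, which reserves one extra class
    for the blank token in addition to the characters themselves.
    """
    return len(characters) + 1


# Toy character lists; the real ones live in the recognition_models dictionary.
pretrained_characters = "0123456789ª"
extended_characters = pretrained_characters + "º"

old_size = num_output_classes(pretrained_characters)
new_size = num_output_classes(extended_characters)

# The checkpoint stores a last layer with old_size outputs, so a model
# built with new_size outputs cannot load those weights as they are.
shapes_match = old_size == new_size
print(old_size, new_size, shapes_match)
```

If new_prediction: True works as suggested above, the last layer would be rebuilt for the extended list and trained from fresh weights, while the earlier layers keep their pre-trained values.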