EasyOCR
Training a custom model: how to improve accuracy?
Hello All, I am trying to train a new model dedicated to the French character set and a domain-specific set of fonts. After a bunch of difficulties ;-) I managed to get training and testing working!
I looked at the train.py code and as far as I understand:
- Each training iteration consumes 32 images (1 torch data batch).
- A round of 10,000 iterations is needed to reach the first model checkpoint.
To cover this training without oversampling, I created a set of 320,000 (10,000 × 32) images/labels. Is this way of thinking correct?
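Not from train.py, just a quick sanity check of that arithmetic: a minimal sketch (variable names are mine, not the trainer's) showing how many samples one checkpoint interval consumes, and how many passes over a 320,000-image set the second checkpoint implies.

```python
# Rough arithmetic behind the dataset-size reasoning above.
# These names are illustrative; they are not variables in train.py.
batch_size = 32                 # images consumed per iteration (1 torch batch)
iters_per_checkpoint = 10_000   # iterations before iter_10000.pth is written

samples_per_checkpoint = batch_size * iters_per_checkpoint
print(samples_per_checkpoint)   # 320000

# With exactly 320,000 unique image/label pairs, the first checkpoint is
# roughly one pass (epoch) over the data; by the second checkpoint the
# trainer is necessarily revisiting the same samples.
dataset_size = 320_000
epochs_at_checkpoint_2 = (2 * samples_per_checkpoint) / dataset_size
print(epochs_at_checkpoint_2)   # 2.0
```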
Training works, and I get the iter_10000.pth model. Testing shows accuracy around 90%. Running to the second checkpoint, the trainer delivers the iter_20000.pth model, but accuracy is no better, even a bit worse. Moreover, when the third round starts I get a CUDA out-of-memory error (RTX 3060 8GB in my box).
Questions: How many images were used to train the latin_g2.pth model? What was the size of the images? How many words were present in each image? What kind of GPU was used? How long did the training take?
Any advice is greatly appreciated. Thanks a lot, AV
I'd also like to know this. The default .pth model works incredibly well, but my own model trained from scratch or using latin_g2.pth as a starting point does not perform as well. If possible, please share how latin_g2.pth was trained, which dataset was used, and its .yaml config file.
@averatio I think you need to decrease the batch size. I'm training on a 12GB 3060 and I've had CUDA memory issues with the default settings.
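Concretely, a minimal sketch of lowering the batch size before launching training; it assumes your trainer config is a YAML file with a top-level batch_size field like the sample configs shipped with the EasyOCR trainer, and the file path here is hypothetical, so check the actual key name in your own config.

```python
# Sketch only: halve batch_size in the trainer's YAML config before launching
# training. Assumes a config laid out like the EasyOCR trainer sample configs
# (a top-level batch_size key); the file name below is hypothetical.
import yaml

config_path = "config_files/my_french_config.yaml"  # hypothetical path

with open(config_path, "r", encoding="utf8") as f:
    cfg = yaml.safe_load(f)

print("current batch_size:", cfg.get("batch_size"))

# Halve it (e.g. 32 -> 16); drop further to 8 if CUDA still runs out of memory.
cfg["batch_size"] = max(1, int(cfg.get("batch_size", 32)) // 2)

with open(config_path, "w", encoding="utf8") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)

print("new batch_size:", cfg["batch_size"])
```

Keep in mind that with a smaller batch size each checkpoint interval covers fewer images, so you may want to raise the number of iterations per checkpoint accordingly.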