Best practices to train a license plate OCR model
Search before asking
- [X] I have searched the YOLOv5 issues and discussions and found no similar questions.
Question
Hello everyone! I'm trying to train a model for license plate recognition (OCR). I have a large dataset with thousands of images, though some classes are more represented than others. If someone can help me, I'll be very thankful.
- What's the best configuration to train and have good results?
- Is it necessary to change any "anchor" or "layers" values? What is recommended?
- About augmentations: how can I disable them, given that I'm training on characters and they cannot be augmented?

The training images range from 80x30 to 300x150 pixels and are real-world license plates.
- With these image dimensions (the images I'll use for inference are very similar), what is the best --image-size for training?

I also have some doubts about the number of epochs, batch size, and image size... Could someone help me, please? Thanks so much.
Additional
No response
You can try PaddleOCR.
Hi @liuyajian. Thanks for your reply. I've tried it, but the accuracy of PaddleOCR is not good enough. I have a rich dataset of license plates, and I think YOLOv5 can give me better accuracy. I've already built other object detectors, but with bigger images (both for training and for inference). For license plates, the images and the objects to detect are smaller, and the objects to detect will always have the same aspect ratio.
YOLOv5 isn't an OCR tool - It's used for object detection. To use YOLOv5 to perform OCR on license plates, you'd have to create bounding boxes and labels associated with each class (i.e. Letters of the Alphabet/Numbers) on EACH license plate. So if a plate was "ABC-123", you would draw 6 (or 7 if you want hyphens) bounding boxes on the images and assign them to the respective class.
Afterwards, you'd still need to perform some kind of post-processing to convert the detections to text. You'd also probably want to use something like agnostic NMS to prevent double-calls.
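That post-processing step can be sketched as follows. The detection format here is an assumption for illustration: one (x_center, class_label) pair per box surviving NMS, which is easy to derive from YOLOv5's output tensors:

```python
def detections_to_text(detections):
    """Convert character detections into a plate string.

    `detections` is a list of (x_center, label) tuples, one per
    detected character box (assumed already filtered by NMS).
    Sorting by horizontal position reads the plate left to right.
    """
    ordered = sorted(detections, key=lambda d: d[0])
    return "".join(label for _, label in ordered)

# e.g. boxes detected out of order on a plate reading "ABC123"
boxes = [(120, "C"), (40, "A"), (80, "B"), (160, "1"), (200, "2"), (240, "3")]
print(detections_to_text(boxes))  # -> "ABC123"
```

For two-line plates you would first cluster boxes by y-coordinate into rows, then sort each row by x; the one-line case above is the minimal version.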
Hello @wolfpack12. Yes, I did that. My dataset is fully labeled, with each character as a different class. I think I didn't express myself correctly: I use YOLOv5 for other object detection tasks, but there the input and training images are bigger (at least 1280x720). In this license plate case I'm using small images, and the inference results are not good. Do you know whether changing the anchors and/or layers configuration can increase accuracy? If so, which configurations should I change?
I'd recommend cropping your training set to look like your test set. If you can't do that, try to letterbox your small images to the scale of your training set. The closer you can make the training images look like your test set, the better your performance should be.
Sure, @wolfpack12. All images are similar. Both the training and test sets are real-world images of license plates only (already cropped), all between roughly 80x30 and 300x150. My doubt is whether the anchor and layer parameters can improve final performance and accuracy. If so, what are the best parameters for that?
I trained on a similar dataset before; my task was OCR of the MRZ on ID cards. I started tuning with yolov5s so I could iterate faster; then you can try yolov5m for better results.
- With these image dimensions, what is the best --image-size to train? => My best image size was 384, but my images had 3 lines of 10 digits each; I think you can try 96 to 160.
- About augmentations, how can I disable them? => In my case, I used heavy augmentations for the best results.
- Is it necessary to change any "anchor" or "layers" values? => I used the defaults.
- What's the best configuration to train and get good results? => You can increase the cls or obj loss gains, since you need correct classification more than precise boxes. YOLOv5 uses the P3, P4, and P5 output layers; you can try P3 only, P4 only, etc., because the characters are almost all the same size.
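For the original question about disabling augmentations: the geometric ones are the risky ones for characters (a horizontal flip mirrors letters, and a vertical flip can turn a 6 into a 9), while photometric jitter is usually safe. Assuming the standard key names from the hyp files shipped in the YOLOv5 data/hyps directory, zeroing those keys in your hyperparameter YAML disables them:

```yaml
# hyp sketch - zero out augmentations unsafe for characters
degrees: 0.0      # no rotation
shear: 0.0        # no shear
perspective: 0.0  # no perspective warp
flipud: 0.0       # no vertical flip (6 vs 9)
fliplr: 0.0       # no horizontal flip (mirrored letters)
mixup: 0.0        # blended characters are ambiguous
# hsv_h / hsv_s / hsv_v color jitter can stay at their defaults
```

Whether to keep mosaic enabled is a judgment call; it does not change character identity, so it is often left on.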
@diegotecrede Hi, I'm also doing an OCR task. Do you know how to change the input size? I don't want the input width and height to be equal; ideally I want a height of 96 and a width of 416. Do you know how to change the code?
How can I increase cls or obj? And how can I use only P3 or only P4? Which files should I change?
Hi @Henryplay. I think you can't define the exact width and height separately; you can only set the --img-size parameter to a multiple of 32, and it will resize your images automatically.
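As a rough sketch of what that resizing does: YOLOv5-style letterboxing scales the long side to --img-size and, in minimum-padding mode, pads each side up to the next multiple of the model stride (32 by default); the exact behaviour depends on the --rect/auto settings, so this is an approximation:

```python
import math

def letterbox_shape(w, h, img_size=160, stride=32):
    """Scaled-and-padded shape a YOLOv5-style letterbox produces:
    long side scaled to img_size, both sides padded up to a
    multiple of the stride (minimum-padding behaviour)."""
    r = img_size / max(w, h)                    # scale factor
    new_w, new_h = round(w * r), round(h * r)   # resized, unpadded
    pad_w = math.ceil(new_w / stride) * stride  # pad up to stride
    pad_h = math.ceil(new_h / stride) * stride
    return pad_w, pad_h

print(letterbox_shape(300, 150))  # -> (160, 96)
```

So a 300x150 plate trained at --img-size 160 ends up as a 160x96 tensor rather than a square, which wastes little padding on these wide, flat images.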
@diegotecrede
- Raise the loss gains in your hyperparameter file, e.g.:
  box: 0.05 # box loss gain
  cls: 0.3 # cls loss gain
  obj: 0.7 # obj loss gain (scale with pixels)
- For P3/P4 only, you can use the yolov5-p34.yaml model config.
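Putting the two suggestions together, training might look like the sketch below. It assumes the current YOLOv5 repo layout (data/hyps/ for hyperparameter files, models/hub/yolov5-p34.yaml for the P3/P4-only head); the dataset file plates.yaml and the copied hyp name are hypothetical placeholders for your own setup:

```shell
# 1) copy a hyp file and raise the classification/objectness gains
#    (box: 0.05, cls: 0.3, obj: 0.7 as suggested above)
cp data/hyps/hyp.scratch-low.yaml data/hyps/hyp.plates.yaml
# edit box/cls/obj in hyp.plates.yaml, then:

# 2) train with the P3/P4-only model config
python train.py --img 160 --batch 64 --epochs 300 \
    --data plates.yaml --weights yolov5s.pt \
    --cfg models/hub/yolov5-p34.yaml \
    --hyp data/hyps/hyp.plates.yaml
```

Note that passing --cfg with pretrained --weights transfers only the compatible layers, so expect a short warm-up period before the P3/P4 head converges.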
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
Access additional YOLOv5 🚀 resources:
- Wiki – https://github.com/ultralytics/yolov5/wiki
- Tutorials – https://docs.ultralytics.com/yolov5
- Docs – https://docs.ultralytics.com
Access additional Ultralytics ⚡ resources:
- Ultralytics HUB – https://ultralytics.com/hub
- Vision API – https://ultralytics.com/yolov5
- About Us – https://ultralytics.com/about
- Join Our Team – https://ultralytics.com/work
- Contact Us – https://ultralytics.com/contact
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!