keras-ocr
Weakly-Supervised Training for Detection
The CRAFT authors used a weakly-supervised training method to handle the fact that most datasets don't annotate at the character level. I saw in your docs that a future release will support weakly-supervised training of the detector model, presumably following section 3.2.2 of the original paper. Have you made a start on this and, if so, do you have an idea of when this would be released? I might have time to try this myself, but figured I'd ask first.
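For reference, here is a minimal sketch of how I read the word-level confidence score in section 3.2.2. The function names are my own and are not part of keras-ocr or the official CRAFT code. Axis-aligned boxes are a simplifying assumption, and the step that actually produces the predicted character boxes (cropping the word region and splitting the region score with watershed) is left out.

```python
def word_confidence(word_length: int, predicted_count: int) -> float:
    """Confidence for one word box: (l - min(l, |l - l_c|)) / l,
    where l is the annotated word length and l_c the number of
    character boxes the interim model predicted inside the word."""
    if word_length <= 0:
        raise ValueError("word_length must be positive")
    error = abs(word_length - predicted_count)
    return (word_length - min(word_length, error)) / word_length


def split_evenly(x_min, y_min, x_max, y_max, word_length):
    """Fallback labels: divide a word box into equal-width character boxes."""
    width = (x_max - x_min) / word_length
    return [
        (x_min + i * width, y_min, x_min + (i + 1) * width, y_max)
        for i in range(word_length)
    ]


def pseudo_label(word_box, word_length, predicted_boxes):
    """Return (character_boxes, confidence) for one word annotation.

    When the confidence is below 0.5, the predicted boxes are discarded,
    the word box is split evenly, and the confidence is set to 0.5.
    """
    conf = word_confidence(word_length, len(predicted_boxes))
    if conf < 0.5:
        return split_evenly(*word_box, word_length), 0.5
    return list(predicted_boxes), conf
```

As I understand it, the returned confidence is then used to weight the per-pixel loss inside that word region, so poorly split words contribute less to training.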
Also, kudos and thanks for this cool project!
Great question -- weakly supervised training has been very challenging to implement (even more than I anticipated). The original authors have not provided an official implementation (not that they should feel obligated to do so) and the dribs and drabs in the issues for the official repository have been difficult for me to follow.
As an alternative, I've been working on generating a labeled dataset by training a character detector on cropped bounding boxes and then using it to label the character boxes in a word-level labeled dataset. The results have been okay but not great. What I have so far for labeling a dataset this way is in this experimental function, and I've tried to document how those labels are created in this experimental notebook.
I would very much like to have a functioning end-to-end weakly supervised pipeline. Being realistic, I don't know when / if I'll have the available time to give that the focus it needs. If you want to take a crack at it, I would be exceedingly grateful, as I'm sure many others would be.
I found several implementations; maybe they help:
- https://github.com/RubanSeven/CRAFT_keras/blob/master/train.py
- https://github.com/backtime92/CRAFT-Reimplementation/blob/master/watershed.py
- https://github.com/autonise/CRAFT-Remade/blob/master/train_weak_supervision/trainer.py
- https://github.com/dotieuthien/CRAFT/blob/master/process_label.py
@faustomorales, I have a question: is the code at https://keras-ocr.readthedocs.io/en/latest/examples/fine_tuning_detector.html weakly supervised training? I used it to train on the ICDAR 2013 dataset, but the results are very bad.