Weakly-Supervised Training for Detection

Open csmcallister opened this issue 5 years ago • 3 comments

The CRAFT authors used a weakly-supervised training method to handle the fact that most datasets don't annotate at the character level. I saw in your docs that a future release will support weakly-supervised training of the detector model, presumably following section 3.2.2 of the original paper. Have you made a start on this and, if so, do you have an idea of when this would be released? I might have time to try this myself, but figured I'd ask first.
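For reference, here is a minimal sketch of the word-level confidence score that section of the paper describes, as I understand it: the score compares the known word length with the number of characters the interim model finds, and when it falls below 0.5 the word box is split into equal-width characters and the confidence is set to 0.5. The function names and the axis-aligned `(x_min, y_min, x_max, y_max)` box format are assumptions for illustration (the paper works with quadrilaterals), and none of this is keras-ocr code.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def word_confidence(word_length: int, predicted_char_count: int) -> float:
    """Confidence in [0, 1]: 1.0 when the predicted count matches the word length."""
    if word_length <= 0:
        raise ValueError("word_length must be positive")
    error = min(word_length, abs(word_length - predicted_char_count))
    return (word_length - error) / word_length


def split_evenly(word_box: Box, word_length: int) -> List[Box]:
    """Fallback: divide the word box into equal-width character boxes."""
    x0, y0, x1, y1 = word_box
    width = (x1 - x0) / word_length
    return [(x0 + i * width, y0, x0 + (i + 1) * width, y1) for i in range(word_length)]


def pseudo_label(
    word_box: Box, word_length: int, predicted_boxes: List[Box]
) -> Tuple[List[Box], float]:
    """Return character boxes and the confidence used to weight their loss."""
    confidence = word_confidence(word_length, len(predicted_boxes))
    if confidence < 0.5:
        # Predictions are too unreliable: use the even split with confidence 0.5.
        return split_evenly(word_box, word_length), 0.5
    return predicted_boxes, confidence
```

For example, a five-letter word for which the interim model finds four characters gets a confidence of 0.8, so its pseudo-labels count for less in the loss than those of a word whose character count matches exactly.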

Also, kudos and thanks for this cool project!

csmcallister avatar Mar 02 '20 17:03 csmcallister

Great question -- weakly supervised training has been very challenging to implement (even more than I anticipated). The original authors have not provided an official implementation (not that they should feel obligated to do so) and the dribs and drabs in the issues for the official repository have been difficult for me to follow.

As an alternative, I've been working on generating a labeled dataset by training a character detector on cropped bounding boxes and then using it to label the character boxes in a word-level labeled dataset. The results have been okay but not great. What I have so far for labeling a dataset this way is in this experimental function, and I've tried to document the approach used to create those labels in this experimental notebook.
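A rough sketch of that labeling idea, under stated assumptions: `detect_chars` is a hypothetical callable standing in for the character detector (it is not the experimental function mentioned above), boxes are axis-aligned `(x_min, y_min, x_max, y_max)` tuples, the image is indexable as rows of pixels, and text is assumed to read left to right. A word is labeled only when the number of detected characters matches the transcription.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max), in pixels.
Box = Tuple[int, int, int, int]


def label_characters(
    image: Sequence[Sequence],
    word_box: Box,
    text: str,
    detect_chars: Callable[[list], List[Box]],  # hypothetical detector, crop coordinates
) -> Optional[List[Tuple[str, Box]]]:
    """Assign each character of `text` a box in full-image coordinates, or None."""
    x0, y0, x1, y1 = word_box
    # Crop the word region (with a NumPy array this would be image[y0:y1, x0:x1]).
    crop = [row[x0:x1] for row in image[y0:y1]]
    # Sort detections left to right so they line up with the transcription.
    boxes = sorted(detect_chars(crop), key=lambda box: box[0])
    if len(boxes) != len(text):
        # Count mismatch: skip this word instead of guessing an alignment.
        return None
    # Shift boxes from crop coordinates back into image coordinates.
    return [
        (char, (bx0 + x0, by0 + y0, bx1 + x0, by1 + y0))
        for char, (bx0, by0, bx1, by1) in zip(text, boxes)
    ]
```

Skipping mismatched words keeps the generated labels cleaner at the cost of coverage, which is one plausible reason results from this kind of approach can be uneven.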

I would very much like to have a functioning end-to-end weakly supervised pipeline. Being realistic, I don't know when / if I'll have the available time to give that the focus it needs. If you want to take a crack at it, I would be exceedingly grateful, as I'm sure many others would be.

faustomorales avatar Mar 07 '20 21:03 faustomorales

I found several implementations; maybe they will help.

  • https://github.com/RubanSeven/CRAFT_keras/blob/master/train.py
  • https://github.com/backtime92/CRAFT-Reimplementation/blob/master/watershed.py
  • https://github.com/autonise/CRAFT-Remade/blob/master/train_weak_supervision/trainer.py
  • https://github.com/dotieuthien/CRAFT/blob/master/process_label.py

threefoldo avatar May 20 '20 16:05 threefoldo

@faustomorales, I have a question: is the code at https://keras-ocr.readthedocs.io/en/latest/examples/fine_tuning_detector.html weakly supervised training? I used it to train on the ICDAR 2013 dataset, but the results were very bad.

Devin19910617 avatar Jul 02 '20 12:07 Devin19910617