deep-text-recognition-benchmark Which data is good to train?

Open kimlia545 opened this issue 4 years ago • 3 comments

The paper says: "This result showed that the diversity of training data can be more important than the number of training examples, and that the effects of using different training datasets is more complex than simply concluding more is better." You used MJSynth and SynthText in combination. I want to train on Korean-language data. Should I use data with varied colors, fonts, backgrounds, widths, gradients, distortions, and blurs?

Jan 08 '21 01:01 kimlia545

I don't think using RGB images will help, because the network input has one channel by default. However, you can change that by setting opt.rgb=True.
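To make the point about channels concrete, here is a minimal sketch of how an option like this could be wired up. It is not copied from the repository: the `--rgb` and `--input_channel` flag names and the helper `parse_options` are assumptions for illustration, so check the repo's own train.py for the exact names and defaults.

```python
import argparse


def build_parser():
    # Hypothetical option names, chosen to mirror the `opt.rgb` setting above.
    parser = argparse.ArgumentParser(description="channel option sketch")
    parser.add_argument("--rgb", action="store_true",
                        help="feed 3-channel RGB images instead of grayscale")
    parser.add_argument("--input_channel", type=int, default=1,
                        help="number of input channels for the feature extractor")
    return parser


def parse_options(argv):
    # Parse the flags, then let --rgb override the channel count.
    opt = build_parser().parse_args(argv)
    if opt.rgb:
        opt.input_channel = 3
    return opt


if __name__ == "__main__":
    print(parse_options([]).input_channel)         # grayscale default
    print(parse_options(["--rgb"]).input_channel)  # RGB input
```

With a one-channel default, images are typically converted to grayscale before they reach the network, so color variety in the training set would only matter once the RGB option is turned on.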

Jan 12 '21 06:01 yakhyo

@yakhyo Thanks

Feb 03 '21 06:02 kimlia545

Hi @kimlia545, do you happen to have a pretrained model for Korean (or Korean + English) that you can share? Their site only supports around 10 tests per day, so I would like to run a separate model on-premises. Thank you!

Jun 21 '22 01:06 bit-scientist
