
Fine-tuning problem! Urgent

Open gitdeepheolp opened this issue 3 years ago • 4 comments

So I trained from scratch using the LMDB dataset from the repo, then created my own dataset with trdg to add non-Latin characters and retrain the model. With the original dataset I got pretty good predictions, but after training with the dataset I created, the results were very bad. Please help! Or should I train the model from scratch using non-Latin characters?

Another question: after creating the dataset .mdb files, do I also have to replace the datasets in the validation folder?

This is confusing: we create the training dataset with python3 create_lmdb_dataset.py --inputPath data/ --gtFile data/gt.txt --outputPath result/

But what about the validation datasets?

gitdeepheolp avatar May 18 '22 08:05 gitdeepheolp

Finally figured it out: you have to generate a separate set of images with trdg for validation. There's no mention of that, or I missed it.
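To make the fix above concrete, here is a hypothetical command sketch. The trdg counts, language flag, and directory names are illustrative assumptions, and the create_lmdb_dataset.py invocation follows the usage quoted earlier in this thread; you still need a gt.txt mapping image paths to labels in the format the repo expects.

```shell
# Generate separate image sets for training and validation (paths are illustrative)
trdg -c 10000 -l ru --output_dir data_train/images   # training images
trdg -c 1000  -l ru --output_dir data_valid/images   # held-out validation images

# Convert each set (with its own gt.txt) into its own LMDB
python3 create_lmdb_dataset.py --inputPath data_train/ --gtFile data_train/gt.txt --outputPath result/train/
python3 create_lmdb_dataset.py --inputPath data_valid/ --gtFile data_valid/gt.txt --outputPath result/valid/
```

The key point is that the validation LMDB must come from images the model never sees during training, so it is built as a second, independent run of the same pipeline.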

gitdeepheolp avatar May 18 '22 13:05 gitdeepheolp

So, I trained my model with the English LMDB dataset, then fine-tuned with a Russian dataset, and now the model recognizes only Russian characters...

gitdeepheolp avatar May 19 '22 20:05 gitdeepheolp

> So, I trained my model with the English LMDB dataset, then fine-tuned with a Russian dataset, and now the model recognizes only Russian characters...

Could you please give more details? I trained this network on a Persian dataset from scratch, but the results are not good. How would fine-tuning a network trained on a different dataset be effective? Thanks.
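The symptom described above (the model recognizing only Russian after fine-tuning) is classic catastrophic forgetting: the fine-tuning data contains no Latin text, so the Latin classes degrade. A hedged mitigation sketch, assuming your copy of the repo's train.py accepts a --character string as in the original codebase: keep the Latin characters in the fine-tuning character set (and ideally mix some English samples back into the fine-tuning data). The charsets below are illustrative, not the repo's defaults.

```python
# Hypothetical sketch: build a combined character set so fine-tuning on
# Russian does not drop the Latin characters learned in pretraining.
latin = "0123456789abcdefghijklmnopqrstuvwxyz"
cyrillic = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"

# Deduplicate while preserving order; pass the result as the --character
# argument for both pretraining and fine-tuning so the output layer keeps
# the same classes across both stages.
combined = "".join(dict.fromkeys(latin + cyrillic))
print(combined)
```

If the character set changes between pretraining and fine-tuning, the prediction layer's class indices no longer line up, which by itself can wipe out the old alphabet, so using one combined set from the start is the safer design.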

MHasanlou1 avatar Sep 06 '22 06:09 MHasanlou1

Hello, I am also working on this topic. I hope we can communicate via email. Thanks.

ftmasadi avatar Oct 08 '22 11:10 ftmasadi