
For other languages' handwriting generation -- how to properly fine-tune the model?

Open rainchamber opened this issue 2 years ago • 4 comments

@ankanbhunia

While you mention: "You can train the model in any custom dataset other than IAM and CVL. The process involves creating a dataset_name.pickle file and placing it inside files folder. The structure of dataset_name.pickle is a simple python dictionary."
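For reference, here is a minimal sketch of how such a `dataset_name.pickle` might be assembled. The exact schema is defined in the repo README, so this assumes a hypothetical split -> writer -> samples layout; the filename, writer IDs, and image encoding below are illustrative only.

```python
import pickle

# Hypothetical builder for files/dataset_name.pickle -- check the repo README
# for the exact schema. We assume each sample holds a word image (e.g. PNG
# bytes or an array) and its transcription, grouped by writer and split.
def build_dataset(samples):
    """samples: iterable of (split, writer_id, img, label) tuples."""
    data = {"train": {}, "test": {}}
    for split, writer, img, label in samples:
        data[split].setdefault(writer, []).append({"img": img, "label": label})
    return data

dataset = build_dataset([
    ("train", "writer_001", b"<png bytes>", "こんにちは"),
    ("test",  "writer_002", b"<png bytes>", "漢字"),
])

blob = pickle.dumps(dataset)
# open("files/japanese.pickle", "wb").write(blob)  # place it inside the files folder
```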

I went through the code, and the approach you suggest seems to retrain the model on another dataset from scratch rather than fine-tune your model. Since your paper does not discuss this much, I'd like to ask your opinion: if I want to apply the model to generate handwriting in other languages, e.g. Japanese, is there a way to quickly fine-tune your model on a new Japanese handwriting dataset, or do I need to retrain it from scratch?

rainchamber avatar Jan 30 '23 14:01 rainchamber

I haven't tried fine-tuning for a different language, so I can't say for sure what will happen. However, here's what I think should happen: for a different language, you need to change the OCR network's last layer accordingly. The knowledge of a fully trained OCR network is very specific to its language, so I believe there is no benefit to reusing OCR weights from a different language. In other words, you may need to initialize the OCR weights randomly. You can, however, reuse the generator and discriminator weights from the IAM pretrained checkpoint, which can potentially reduce the overall training time.
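A minimal PyTorch sketch of this selective loading, using toy stand-in modules (the real module names and prefixes in the repo will differ): checkpoint entries for the generator and discriminator are reused, while OCR layers whose shapes no longer match the new charset are skipped and left randomly initialized.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the generator, discriminator and OCR head.
class Toy(nn.Module):
    def __init__(self, n_out):
        super().__init__()
        self.fc = nn.Linear(8, n_out)

netG, netD, ocr = Toy(8), Toy(1), Toy(80)  # 80 ~ an English charset size

# Pretend this is the IAM checkpoint, with module-name key prefixes.
ckpt = {"netG." + k: v for k, v in netG.state_dict().items()}
ckpt.update({"netD." + k: v for k, v in netD.state_dict().items()})
ckpt.update({"ocr." + k: v for k, v in ocr.state_dict().items()})

# New model for Japanese: same G/D, but a freshly sized OCR output layer.
new_ocr = Toy(3000)  # e.g. ~3000 kana/kanji classes

def load_prefixed(module, ckpt, prefix):
    """Load only checkpoint entries under `prefix` whose shapes match."""
    own = module.state_dict()
    kept = {k[len(prefix):]: v for k, v in ckpt.items()
            if k.startswith(prefix) and k[len(prefix):] in own
            and v.shape == own[k[len(prefix):]].shape}
    module.load_state_dict(kept, strict=False)
    return sorted(kept)

loaded_g = load_prefixed(netG, ckpt, "netG.")      # fully reused
loaded_ocr = load_prefixed(new_ocr, ckpt, "ocr.")  # mismatched layers skipped
```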

If you proceed in this manner, though, you may encounter training instabilities. To deal with this, you can pretrain the OCR network separately before plugging it into the end-to-end GAN training.
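That separate pretraining step could look like the CTC-based recognizer training sketched below. The tiny CRNN-style architecture, tensor sizes, and random data here are placeholders, not the repo's actual OCR network; the point is only the standard `nn.CTCLoss` training step you would run before plugging the OCR into the GAN.

```python
import torch
import torch.nn as nn

# Minimal CRNN-style recognizer sketch; sizes are illustrative only.
class TinyOCR(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.conv = nn.Conv2d(1, 16, 3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d((1, 32))  # collapse to 32 time steps
        self.rnn = nn.LSTM(16, 32, batch_first=True, bidirectional=True)
        self.head = nn.Linear(64, n_classes)  # n_classes includes CTC blank (0)

    def forward(self, x):                       # x: (N, 1, H, W)
        f = self.pool(torch.relu(self.conv(x))) # (N, 16, 1, 32)
        f = f.squeeze(2).transpose(1, 2)        # (N, 32, 16)
        out, _ = self.rnn(f)                    # (N, 32, 64)
        return self.head(out)                   # (N, 32, n_classes)

model = TinyOCR(n_classes=80)
ctc = nn.CTCLoss(blank=0, zero_infinity=True)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

imgs = torch.randn(4, 1, 32, 128)        # fake word crops
targets = torch.randint(1, 80, (4, 10))  # fake label indices (no blank)
in_lens = torch.full((4,), 32, dtype=torch.long)
tgt_lens = torch.full((4,), 10, dtype=torch.long)

logits = model(imgs)                               # (N, T, C)
log_probs = logits.log_softmax(2).transpose(0, 1)  # CTC expects (T, N, C)
loss = ctc(log_probs, targets, in_lens, tgt_lens)
opt.zero_grad(); loss.backward(); opt.step()
```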

Having said that, I'd suggest retraining it from scratch. It will be more straightforward :).  

ankanbhunia avatar Jan 30 '23 16:01 ankanbhunia

@ankanbhunia Thanks for the reply!

Just a quick follow-up question: if I wanted to work on a historical English handwriting dataset (perhaps extracted from old Bibles), putting the OCR network issue aside, would you suggest training from scratch or fine-tuning the model?

Thanks in advance for letting me know! I'm trying to extend your cool paper in some projects.

rainchamber avatar Jan 30 '23 19:01 rainchamber

In that case, I'd suggest trying to fine-tune the model first.

ankanbhunia avatar Jan 31 '23 16:01 ankanbhunia

Thanks!

rainchamber avatar Jan 31 '23 22:01 rainchamber