handwriting-ocr train a new model

How do I train a new model with a new data set?

Feb 15 '18 13:02 slbinilkumar

Nice question. I am wondering the same. Please tell me how to train the model?

Feb 20 '18 13:02 yasersakkaf

Hi,

This depends on the model you want to train. All the notebooks used for model training have Classifier in their name. These notebooks load data from the data folder (if you haven't already, you have to download the data from the provided URL), process it, and train the model, which is then saved in the models folder.

You don't have to do much more than replace the original data with yours and train the model. Your data has to be in the right format, which depends on the type of model. Often the data is stored as image files named in the format label_timestamp.jpg.
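As a rough illustration of that naming convention, here is a minimal sketch of how a label could be recovered from such a file name. The helper name label_from_filename is hypothetical and is not part of the repository; it also assumes the timestamp is whatever follows the last underscore.

```python
from pathlib import Path


def label_from_filename(path):
    """Return the label part of a 'label_timestamp.jpg' file name.

    Hypothetical helper, not taken from the handwriting-ocr code.
    """
    stem = Path(path).stem  # e.g. 'sell_15132719'
    # Split on the LAST underscore so labels containing '_' survive.
    label, sep, _timestamp = stem.rpartition("_")
    # If there is no underscore at all, treat the whole stem as the label.
    return label if sep else stem
```

For example, label_from_filename("data/words2/sell_15132719.jpg") gives "sell".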

If you need more details, please specify the model you want to train.

Feb 21 '18 21:02 Breta01

I want to train the CTC word classifier. How do I do it?

Feb 23 '18 12:02 yasersakkaf

OK, that's the easy one.

The training code is in this notebook: WordClassifier-CTC.ipynb. Currently, the data is loaded from the folder data/words2/ (the location is a parameter of loadWordsData()). In this folder I have images of words which are already normalized (grayscale, with a height of 60px). The word images are named word_timestamp.jpg (word is the correct label and timestamp can be just a random number). For example, an image of the word "sell" is named sell_15132719.jpg.
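If you are preparing your own images, a small sketch like the following may help with the naming and the 60px height. Both function names are hypothetical, and scaling the width proportionally is my assumption; the thread only states the target height. The actual resizing and grayscale conversion would still need an image library of your choice.

```python
import time

TARGET_HEIGHT = 60  # height of the normalized word images mentioned above


def target_name(word, timestamp=None):
    """Build a 'word_timestamp.jpg' file name (hypothetical helper)."""
    if timestamp is None:
        # Any number works as the suffix; the current time is one option.
        timestamp = int(time.time())
    return f"{word}_{timestamp}.jpg"


def target_size(width, height, target_height=TARGET_HEIGHT):
    """Return (new_width, new_height) for an image scaled to target_height.

    Assumption: the aspect ratio is preserved when resizing.
    """
    new_width = max(1, round(width * target_height / height))
    return new_width, target_height
```

For instance, a 200x120 scan of the word "sell" would be resized to target_size(200, 120) and saved under a name such as target_name("sell", 15132719).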

loadWordsData() loads the grayscale images and returns numpy arrays of images and labels. The model is then trained and saved to the location defined by the save_location variable.
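To check that a folder of your own data is laid out as expected before training, something along these lines could be used. This is only a sketch and not the repository's loadWordsData(): the name list_word_files is hypothetical, and it only collects file paths and labels without reading any image data.

```python
from pathlib import Path


def list_word_files(folder):
    """Collect .jpg paths in a folder and the labels encoded in their names.

    Hypothetical helper; assumes files are named 'word_timestamp.jpg'.
    """
    paths = sorted(Path(folder).glob("*.jpg"))
    # The label is everything before the last underscore of the file stem;
    # fall back to the whole stem if there is no underscore.
    labels = [p.stem.rpartition("_")[0] or p.stem for p in paths]
    return paths, labels
```

Calling list_word_files("data/words2/") and printing a few of the returned labels is a quick way to spot misnamed files.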

I hope this helps.

Feb 24 '18 15:02 Breta01

What are the .txt files in data/words2? I am going to retrain the char classifier and it needs the .txt files. How can I generate .txt files for my data?

Jul 05 '18 13:07 mhsamavatian

This question is a duplicate of #44.

Jul 08 '18 14:07 Breta01
