handwriting-ocr

train a new model

Open slbinilkumar opened this issue 7 years ago • 6 comments
How do I train a new model with a new data set?

slbinilkumar, Feb 15 '18 13:02

Nice question, I am wondering the same. Please tell me how to train the model.

yasersakkaf, Feb 20 '18 13:02

Hi,

This depends on the model you want to train. All the notebooks for model training have Classifier in their name. These notebooks load data from the data folder (if you haven't already, you have to download the data from the provided URL), process them, and train the model, which is then saved in the models folder.

You don't have to do much more than replace the original data with yours and train the model. Your data have to be in the right format, which depends on the type of model. Often the data are stored as image files with names in the format label_timestamp.jpg.
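To illustrate the label_timestamp.jpg convention, here is a minimal sketch of how a label could be recovered from such a file name. The function name `label_from_filename` and the example file name are made up for illustration and are not taken from the repository; splitting on the last underscore (so that a label may itself contain underscores) is also an assumption.

```python
from pathlib import Path


def label_from_filename(path):
    """Return the label part of a file named label_timestamp.jpg.

    Hypothetical helper, not part of the repository. It splits on the LAST
    underscore, assuming the timestamp itself contains no underscore.
    """
    stem = Path(path).stem  # e.g. "hello_1519200000"
    label, sep, _timestamp = stem.rpartition("_")
    if not sep or not label:
        raise ValueError(f"expected label_timestamp.jpg, got {path!r}")
    return label


print(label_from_filename("hello_1519200000.jpg"))  # hello
```

If your own files are named differently, renaming them to this pattern is probably easier than changing the notebooks.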

If you need more details, please specify the model you want to train.

Breta01, Feb 21 '18 21:02

I want to train the word classifier with CTC. How do I do it?

yasersakkaf, Feb 23 '18 12:02

OK, that's the easy one.

The training code is in this notebook: WordClassifier-CTC.ipynb. Currently, the data are loaded from the folder data/words2/ (the location is a parameter of loadWordsData()). In this folder I have images of words that are already normalized (grayscale, with a height of 60px). The word images are named word_timestamp.jpg (word is the correct label and timestamp can be just a random number). For example, an image of the word "sell" is named sell_15132719.jpg.
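When preparing your own word images for this folder, two small calculations come up: the size an image should be resized to so that its height is 60px, and the file name it should get. The sketch below covers both under stated assumptions: the helper names `scaled_size` and `dataset_filename` are hypothetical, and keeping the aspect ratio when resizing is my assumption, not something the notebook is confirmed to require. The actual resizing and grayscale conversion are left to whichever image library you use.

```python
import random

TARGET_HEIGHT = 60  # height of the normalized word images mentioned above


def scaled_size(width, height, target_height=TARGET_HEIGHT):
    """Return (new_width, new_height) for a resize that keeps the aspect ratio.

    Hypothetical helper; preserving the aspect ratio is an assumption.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    new_width = max(1, round(width * target_height / height))
    return new_width, target_height


def dataset_filename(word, suffix=None):
    """Build a name of the form word_timestamp.jpg.

    The suffix only has to make the name unique, so a random number is
    used when none is given.
    """
    if suffix is None:
        suffix = random.randrange(10**8)
    return f"{word}_{suffix}.jpg"


print(scaled_size(300, 120))                 # (150, 60)
print(dataset_filename("sell", 15132719))    # sell_15132719.jpg
```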

loadWordsData() loads the grayscale images and returns NumPy arrays of images and labels. The model is then trained and saved to the location defined by the save_location variable.
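Before training, it can be useful to check which files in your folder would be picked up and with which labels. The following is only a rough sketch of that pairing step and is not the repository's loadWordsData(): the function name `list_word_images` is hypothetical, it assumes .jpg files named word_timestamp.jpg, and it does not decode the images or build NumPy arrays.

```python
from pathlib import Path


def list_word_images(folder):
    """Return (paths, labels) for every *.jpg in `folder` named word_timestamp.jpg.

    Hypothetical helper for inspecting a data set. Files that do not follow
    the naming convention are skipped.
    """
    paths, labels = [], []
    for path in sorted(Path(folder).glob("*.jpg")):
        label, sep, _timestamp = path.stem.rpartition("_")
        if not sep or not label:
            continue  # no underscore or empty label: not a labeled image
        paths.append(path)
        labels.append(label)
    return paths, labels
```

For example, calling `list_word_images("data/words2/")` and printing the first few labels gives a quick check that the labels look right before you run the notebook.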

I hope this helps.

Breta01, Feb 24 '18 15:02

What are the .txt files in data/words2? I am going to retrain the char classifier and it needs the .txt files. How can I generate .txt files for my data?

mhsamavatian, Jul 05 '18 13:07

This question is a duplicate of #44.

Breta01, Jul 08 '18 14:07