deep-text-recognition-benchmark
[Question] Configuring the dataset for custom training
I am setting up a custom dataset for training.
Q1. Is it okay to include images whose text contains characters I do not want to train on? For example, if the characters I want to train on are 1, 2, 3, can I include an image and label such as "1ABC2E3" in the dataset (like: "1ABC2E3.jpg", "1ABC2E3")?
Q2. The default values for imgH and imgW during training are 32x100. Is it okay to use these values even if the aspect ratio of the training image is 1:1?
For example, when the training image is roughly 64x70, which is closer to a square.
Q3. If I want to keep the numbers as they are and add only a few characters of my own, is it acceptable to add images of just those specific characters?
Q4. Is it better to use preprocessed images for training?
Let me explain what I know. I hope this helps.
Q1. You should edit your character set so that it covers every character that appears in your labels. (Please refer to --> parser.add_argument('--character', ....))
Q3. The input image must be a cropped image of the text area, and the label must be the text corresponding to that area.
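To make the Q1 advice concrete, here is a minimal sketch of how you might check your labels against the character set before training. It is not code from the repository: the helper names (find_unsupported_chars, split_labels), the sample list, and the digits-only character string are all hypothetical and only illustrate the idea of comparing each label with the value you pass to --character.

```python
def find_unsupported_chars(label, character):
    """Return the sorted characters in `label` that are not in `character`."""
    allowed = set(character)
    return sorted({ch for ch in label if ch not in allowed})


def split_labels(samples, character):
    """Split (image_name, label) pairs into usable and rejected lists.

    A sample is rejected when its label contains at least one character
    that is missing from the character set.
    """
    usable, rejected = [], []
    for image_name, label in samples:
        missing = find_unsupported_chars(label, character)
        if missing:
            rejected.append((image_name, label, missing))
        else:
            usable.append((image_name, label))
    return usable, rejected


# Hypothetical character set: digits only (an assumption for this example).
character = "0123456789"

# Hypothetical samples, including the one from Q1.
samples = [("123.jpg", "123"), ("1ABC2E3.jpg", "1ABC2E3")]

usable, rejected = split_labels(samples, character)
print("usable:", usable)
print("rejected:", rejected)
```

If a label such as "1ABC2E3" shows up in the rejected list, you can either add the missing characters to the character set or leave that sample out of the dataset.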
@Seoung-wook Thanks a lot