DeepTextSpotter

Regarding the SynthText dataset

Open mowkee opened this issue 6 years ago • 1 comment

Good day,

I've been reading the dup_boxes_synth_text.py script from your data conversion scripts repository. If I'm not mistaken, this is the script one has to use to convert the SynthText dataset for training.

On lines 59 through 62, the script loads three files: imnames.np.npy, wordBB.np.npy and gt_txt.npz.

My question is: how should I generate these files?

Do I have to modify the gen.py script from the SynthText GitHub repository to generate them, or are they created from the gt.mat file that comes with the pregenerated SynthText dataset of 800,000 images linked in the SynthText GitHub repository?

If so, could you tell me the format of the data within these files, or point me to (or provide) a script that does this?

Your help is greatly appreciated.

mowkee, Sep 27 '18 12:09

@MichalBusta

Is there anything you can suggest?

Please help.

mowkee, Sep 30 '18 10:09