
For Chinese text detection, how should the configuration file be modified when training DBNet on ICDAR2019?

Open ViviKing414 opened this issue 2 years ago • 2 comments

I look forward to advice from anyone with experience in this. Thanks.

ViviKing414 avatar Sep 09 '22 08:09 ViviKing414

Hi,

To train a text detector on datasets containing Chinese or other languages, you do not necessarily need to change much in the configs. For example, IC17 (MLT) also contains multilingual text instances, and you may find that the PANet IC17 config is almost the same as its IC15 version.

However, one thing worth noting is that our pretrained DBNet model was trained on a purely English synthetic dataset:

https://github.com/open-mmlab/mmocr/blob/3c63f736cbaaa55c4794fa3e61150745fef46f10/configs/textdet/dbnet/dbnet_r50dcnv2_fpnc_1200e_icdar2015.py#L15

If you would like to train a Chinese text detector, it would be better to first pre-train your own model on a Chinese synthetic dataset.
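
As a rough illustration of the advice above, the change amounts to keeping the existing config and swapping two things: where the data lives and which pretrained checkpoint is loaded. The sketch below uses a plain Python dict rather than the real MMOCR config API, and every key name and path in it (`data_root`, `load_from`, `data/icdar2019`, the checkpoint file names) is a hypothetical placeholder, not a value taken from the repository.

```python
import copy

# Hypothetical stand-in for the fields of an existing DBNet config that matter here.
# The key names and paths are placeholders for illustration only.
base_cfg = {
    "data_root": "data/icdar2015",
    "load_from": "path/to/english_synth_pretrain.pth",
    "total_epochs": 1200,
}


def derive_config(base, data_root, load_from):
    """Return a copy of `base` with only the dataset root and pretrained checkpoint replaced."""
    cfg = copy.deepcopy(base)  # leave the base config untouched
    cfg["data_root"] = data_root
    cfg["load_from"] = load_from
    return cfg


# Hypothetical ICDAR2019 variant that starts from a Chinese synthetic pretraining checkpoint.
ic19_cfg = derive_config(
    base_cfg,
    data_root="data/icdar2019",
    load_from="path/to/chinese_synth_pretrain.pth",
)

# List the keys whose values differ from the base config.
changed = sorted(k for k in base_cfg if base_cfg[k] != ic19_cfg[k])
print(changed)
```

Everything else (model, pipeline, schedule) is carried over unchanged, which matches the point that the IC17 and IC15 configs differ very little.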

xinke-wang avatar Sep 09 '22 08:09 xinke-wang

Thank you very much. I will first try downloading IC17 to train the text detection model.

ViviKing414 avatar Sep 09 '22 09:09 ViviKing414