Performance on TextOCR Dataset

Open jkcg-learning opened this issue 3 years ago • 6 comments

Motivation

Improve the benchmark performance of all algorithms using the TextOCR dataset released by the Facebook AI Research team.

Related resources: https://textvqa.org/textocr

Overview: TextOCR requires models to perform text recognition on arbitrarily shaped scene text in natural images. It provides ~1M high-quality word annotations on TextVQA images, enabling end-to-end reasoning on downstream tasks such as visual question answering or image captioning.

Statistics: 28,134 natural images from TextVQA; 903,069 annotated scene-text words; 32 words per image on average.

jkcg-learning, Jun 03 '21 07:06

Thanks for your suggestion. We will include it in our July plan.

cuhk-hbsun, Jun 04 '21 02:06

Team, is this under consideration for the next release?

jkcg-learning, Jul 13 '21 08:07

We already support the TextOCR dataset (https://mmocr.readthedocs.io/en/latest/datasets.html).

gaotongxiao, Jul 13 '21 08:07

Thanks for adding this dataset for training.

Can we also expect a model checkpoint from the team trained specifically on this dataset?

jkcg-learning, Jul 13 '21 09:07

Currently we only have DBNet pretrained on TextOCR. Do you have any requests for the model type and the specific datasets that it is pretrained on? We may add that to our plan if we believe that it also benefits our community.

gaotongxiao, Jul 13 '21 10:07

https://mmocr.readthedocs.io/en/latest/textdet_models.html#icdar2015

Is it possible to update the DBNet model zoo with the details of your model training and the metrics on the TextOCR dataset?

jkcg-learning, Jul 13 '21 10:07