chinese-subtitle-ocr: Question about images.csv


Open qimingfeijin opened this issue 6 years ago • 5 comments


Running download_images.py fails with the error No such file or directory: 'images.csv'. How can I fix this?

Feb 18 '19 12:02 qimingfeijin

Hi,

Instead of running download_images.py, just use the COCO dataset. It is much smaller, and for OCR you don't actually need that many images. You can download 5K images directly here: http://images.cocodataset.org/zips/val2017.zip. Then you don't need download_images.py at all.

Hope this helps.
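
For readers who want to script the step above, here is a minimal sketch using only the Python standard library. It is not part of the repository: the function names `download`, `extract_images`, and `main`, and the local `data/` paths, are assumptions made for illustration. Only the URL comes from the comment above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the comment above.
COCO_VAL_URL = "http://images.cocodataset.org/zips/val2017.zip"


def download(url: str, dest: Path) -> Path:
    """Download url to dest unless the file already exists."""
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract_images(zip_path: Path, out_dir: Path) -> list:
    """Extract only the .jpg members of the archive and return their paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".jpg"):
                zf.extract(name, out_dir)
                extracted.append(out_dir / name)
    return extracted


def main() -> None:
    # Hypothetical local paths; adjust them to your own layout.
    archive = download(COCO_VAL_URL, Path("data/val2017.zip"))
    images = extract_images(archive, Path("data"))
    print(f"extracted {len(images)} images")
```

`main()` is deliberately not called at import time, since it fetches a large archive. Call it once, and later runs will reuse the zip file that is already on disk.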

Feb 19 '19 19:02 lars76

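Thank you for your help and for sharing. I want to do Chinese text detection and need some Chinese images for training and testing. Where did you download your Chinese dataset?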

Feb 20 '19 02:02 qimingfeijin

I generated the dataset myself by using a subtitle file (srt) and then doing manual annotation. I don't think that there are any datasets that you can download.
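
As a rough illustration of the first half of that workflow, the sketch below reads the cues (timing and text) out of an SRT file, so each subtitle line can be paired with the video frames it appears on. This is an assumption about how such a pipeline could start, not the code used for this project. `Cue` and `parse_srt` are hypothetical names, and the manual annotation step is not covered.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    start: str  # timestamp in SRT form, "HH:MM:SS,mmm"
    end: str
    text: str


TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})"
)


def parse_srt(content: str) -> List[Cue]:
    """Split SRT content into cues; blocks without a timing line are skipped."""
    cues = []
    normalized = content.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                # Everything after the timing line is subtitle text.
                kept = [part.strip() for part in lines[i + 1:] if part.strip()]
                if kept:
                    cues.append(Cue(match.group(1), match.group(2), "\n".join(kept)))
                break
    return cues
```

For a file on disk, something like `parse_srt(open("movie.srt", encoding="utf-8").read())` would return the list of cues; the file name and encoding here are placeholders.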

Most papers actually generate their own training/test images by rendering random text onto images. Look at this GitHub project: https://github.com/JarveeLee/SynthText_Chinese_version (the corresponding paper is described at https://blog.csdn.net/u010167269/article/details/52389676). I tried something similar myself, and it produced results equal to or better than a real dataset.
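
To make the idea concrete, here is a deliberately simple sketch: it draws a string at a random position on a background image and returns the bounding box as the label. It is neither the SynthText method nor the approach used in this project, only a plain stand-in for "random text on images". It assumes Pillow 8.0 or newer (for `ImageDraw.textbbox`) and a font file that contains Chinese glyphs. `random_position`, `render_sample`, and all paths are hypothetical.

```python
import random
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def random_position(image_size: Tuple[int, int],
                    text_size: Tuple[int, int],
                    rng: random.Random) -> Tuple[int, int]:
    """Pick a top-left corner so the text fits entirely inside the image."""
    img_w, img_h = image_size
    txt_w, txt_h = text_size
    if txt_w > img_w or txt_h > img_h:
        raise ValueError("text does not fit into the image")
    return rng.randint(0, img_w - txt_w), rng.randint(0, img_h - txt_h)


def render_sample(background_path: str, text: str, font_path: str,
                  font_size: int = 32, seed=None):
    """Draw text at a random position; return (image, bounding box)."""
    # Pillow is a third-party package, imported here so that
    # random_position stays usable without it.
    from PIL import Image, ImageDraw, ImageFont

    rng = random.Random(seed)
    image = Image.open(background_path).convert("RGB")
    font = ImageFont.truetype(font_path, font_size)
    draw = ImageDraw.Draw(image)

    # Measure the text at the origin to get its size and glyph offset.
    left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
    width, height = right - left, bottom - top
    x, y = random_position(image.size, (width, height), rng)

    # Shift by the offset so the visible glyphs start exactly at (x, y).
    draw.text((x - left, y - top), text, font=font, fill=(255, 255, 255))
    box: Box = (x, y, x + width, y + height)
    return image, box
```

A call such as `render_sample("val2017/some_image.jpg", "你好世界", "fonts/chinese_font.ttf", seed=0)` would return the rendered image together with its box; both paths are placeholders. A real generator would also vary font, size, color, and background.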

Feb 20 '19 22:02 lars76

I see, thank you for sharing.

Feb 21 '19 01:02 qimingfeijin

@lars76 can you share your method for synthesizing the dataset?

May 06 '19 06:05 wushilian
