SciTSR

Missing Images

Open • ctensmeyer opened this issue 4 years ago • 1 comment

I downloaded the data from the Google Drive link, and several image files are empty (0 bytes), including:

./train/img/1807.04686v1.1.png
./train/img/1610.02534v1.3.png
./train/img/0804.1441v3.3.png
./train/img/1611.02944v1.9.png
./train/img/1712.01039v2.4.png
./train/img/1705.03385v1.2.png
./train/img/1702.00552v1.1.png
./train/img/0705.1956v1.13.png
./train/img/1803.01529v1.3.png
./train/img/1703.09695v1.1.png
./train/img/1803.04786v1.4.png
./train/img/1402.1107v1.1.png
./train/img/1711.01711v11.3.png
./train/img/1808.02152v1.1.png
./train/img/1611.01056v2.11.png
./train/img/1510.03820v4.1.png
./train/img/1603.07603v1.1.png
./train/img/1805.10499v1.1.png
./train/img/1605.03821v2.3.png
./train/img/1412.1842v1.3.png
./train/img/1604.08553v4.5.png
./train/img/1311.4166v1.6.png
./train/img/1210.6912v1.6.png

ctensmeyer, Mar 05 '20 16:03

This is probably because the script that crops the images from the PDFs has a bug and cannot handle every PDF properly. You can skip the empty images or check the PDFs manually. We will update the dataset as soon as possible.

DaDaMrX, Mar 07 '20 04:03
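
For anyone who wants to apply the "skip the empty images" workaround, here is a minimal sketch of one way to do it. It is not part of the SciTSR code. The helper name `split_images` is made up for illustration, and the `*.png` pattern and the `./train/img` directory are assumptions taken from the paths listed in the report.

```python
from pathlib import Path


def split_images(img_dir, pattern="*.png"):
    """Split the files in img_dir into (usable, empty) lists.

    A file counts as empty when its size on disk is 0 bytes.
    Hypothetical helper, not part of the SciTSR repository.
    """
    usable, empty = [], []
    for path in sorted(Path(img_dir).glob(pattern)):
        if not path.is_file():
            continue  # ignore directories that happen to match the pattern
        if path.stat().st_size == 0:
            empty.append(path)
        else:
            usable.append(path)
    return usable, empty


if __name__ == "__main__":
    # Assumed layout: images live in ./train/img, as in the report above.
    usable, empty = split_images("./train/img")
    print(f"{len(usable)} usable images, {len(empty)} empty images")
    for path in empty:
        print(path)
```

Note that this only catches 0-byte files. An image that is non-empty but corrupted would still need a separate check, for example by trying to open it with an image library.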