computervision-recipes

[BUG] Redesign utils_cv.common.data.unzip_url() for unusual scenario

Open simonzhaoms opened this issue 5 years ago • 0 comments

Description

Usually, a zipball extracts into a directory with the same name as the archive minus the .zip extension. For example, all files in odFridgeObjectsTiny.zip end up in the odFridgeObjectsTiny directory after unzipping. But this is not always the case: after unzipping annotations_trainval2017.zip, its files are placed under the annotations directory, not annotations_trainval2017. unzip_url() still returns the annotations_trainval2017 path, which does not exist.

In which platform does it happen?

All platforms

How do we replicate the issue?

>>> from utils_cv.common.data import unzip_url
>>> url = "http://images.cocodataset.org/annotations/annotations_trainval2017.zip"
>>> path = unzip_url(url, exist_ok=True)
>>> path
'/home/simon/repo/fork-cvbp/data/annotations_trainval2017'
>>> from pathlib import Path
>>> Path(path).is_dir()
False
>>> from pprint import pprint
>>> pprint([i for i in Path(path).parent.iterdir() if i.is_dir()])
[PosixPath('/home/simon/repo/fork-cvbp/data/annotations'),
 PosixPath('/home/simon/repo/fork-cvbp/data/PennFudanPed'),
 PosixPath('/home/simon/repo/fork-cvbp/data/odFridgeObjects'),
 PosixPath('/home/simon/repo/fork-cvbp/data/odFridgeObjectsTiny')]
>>> from zipfile import ZipFile
>>> a = ZipFile(Path(path).parent / url.split('/')[-1])
>>> a.printdir()
File Name                                             Modified             Size
annotations/instances_train2017.json           2017-09-01 19:02:24    469785474
annotations/instances_val2017.json             2017-09-01 19:02:32     19987840
annotations/captions_train2017.json            2017-09-01 19:04:56     91865115
annotations/captions_val2017.json              2017-09-01 19:04:58      3872473
annotations/person_keypoints_train2017.json    2017-09-01 19:04:32    238884731
annotations/person_keypoints_val2017.json      2017-09-01 19:04:38     10020657
$ ls /home/simon/repo/fork-cvbp/data/
annotations                   odFridgeObjectsTiny.zip
annotations_trainval2017.zip  odFridgeObjects.zip
odFridgeObjects               PennFudanPed
odFridgeObjectsTiny           PennFudanPed.zip
$ ls /home/simon/repo/fork-cvbp/data/annotations
captions_train2017.json   instances_val2017.json
captions_val2017.json     person_keypoints_train2017.json
instances_train2017.json  person_keypoints_val2017.json

Expected behavior (i.e. solution)

utils_cv.common.data.unzip_url() should be redesigned to handle this scenario, without changing the existing code that calls it, so that the user experience does not degrade.
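One possible direction, as a minimal sketch only: inspect the archive's entries before extraction and return the directory the files actually land in. The helper name `extract_and_locate` and its parameters are hypothetical and not part of utils_cv; the sketch assumes the zipball has already been downloaded and uses only the standard-library `zipfile` and `pathlib` modules.

```python
import zipfile
from pathlib import Path


def extract_and_locate(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the directory holding the files.

    Hypothetical helper for illustration; not the existing unzip_url() logic.
    """
    zip_path = Path(zip_path)
    dest_dir = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
        # Zip entry names always use "/" as the separator.
        tops = {name.split("/")[0] for name in names if name}
        # A single root directory exists when every entry lives under one name.
        single_root = len(tops) == 1 and all("/" in name for name in names)
        if single_root:
            # Extract as-is and report the archive's own root directory,
            # whatever it is called (e.g. "annotations").
            zf.extractall(dest_dir)
            return dest_dir / next(iter(tops))
        # Assumption: with loose files or several roots, extract into a
        # directory named after the zipball so the returned path is valid.
        target = dest_dir / zip_path.stem
        target.mkdir(parents=True, exist_ok=True)
        zf.extractall(target)
        return target
```

For the archive listed above, every entry sits under annotations/, so this sketch would return the data/annotations path rather than data/annotations_trainval2017. Archives such as odFridgeObjectsTiny.zip, whose root directory matches the zip name, would keep returning the same path as today, so existing callers should not need to change.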

Other Comments

simonzhaoms, Oct 14 '19 01:10