dlupi-heteroscedastic-dropout
Re-Creating the dataset
https://github.com/johnwlambert/dlupi-heteroscedastic-dropout/blob/da71881506a550d19a0879fc625d4ff609ac11e3/cnns/imagenet/create_bbox_dataset.py#L160-L172
I'm trying to re-create the exact dataset. I feel like all the shutil.copy lines here should be uncommented because otherwise, I'm just iterating over the data and not creating anything.
@johnwlambert thanks for providing your code to the community, I highly appreciate it.
As pointed out by @devansh20la, I also have trouble creating the dataset as described in the README. I uncommented the lines that @devansh20la described, which successfully creates the training set, but I have some problems creating the validation set.
I downloaded http://image-net.org/Annotation/Annotation.tar.gz (with wget) and ILSVRC2016_CLS-LOC.tar.gz.
I manually copied
../../Annotations/CLS-LOC/train to ILSVRC/Data/CLS-LOC/TrainAnnotation
../../Annotations/CLS-LOC/val to ILSVRC/Data/CLS-LOC/ValAnnotation
which seems to be expected in create_bbox_dataset.py:
self.train_annotation_path = os.path.join(self.imagenet_path, 'TrainAnnotation/train')
self.val_annotation_path = os.path.join(self.imagenet_path, 'ValAnnotation/val')
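A quick way to confirm that the copied folders end up where the two paths quoted above point is a small check like the following. This is only a sketch: check_annotation_layout is a hypothetical helper, not part of the repository, and it assumes imagenet_path is the folder that contains TrainAnnotation and ValAnnotation.

```python
import os


def check_annotation_layout(imagenet_path):
    """Report whether the two annotation folders exist under imagenet_path.

    Hypothetical helper for illustration; the sub-paths mirror the
    os.path.join calls quoted from create_bbox_dataset.py.
    """
    expected = {
        "train": os.path.join(imagenet_path, "TrainAnnotation/train"),
        "val": os.path.join(imagenet_path, "ValAnnotation/val"),
    }
    # True means the directory exists, False means it is missing.
    return {split: os.path.isdir(path) for split, path in expected.items()}
```

Calling check_annotation_layout('ILSVRC/Data/CLS-LOC') returns a dict with a True/False entry for "train" and "val", which makes a misplaced copy easy to spot before running the script.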
When running python cnns/imagenet/create_bbox_dataset.py I receive the following message: "We have a massive error: n04562935 not in val_synsets"
It seems as if the validation set is not created. I am a bit confused by the "validation_synsets". Looking into the ILSVRC2016_CLS-LOC.tar.gz dataset, there are synset folders for training but not for validation or testing.
The README also states "At this point, we'll arrange the image data into three folders: "train", "val", and "test".
6.3G val.zip 56G train.zip" Do we have to create these splits manually, or do we use the train/val/test split of the original ImageNet data?
What is the role of the file http://image-net.org/Annotation/Annotation.tar.gz? create_bbox_dataset.py does not use it, right?
Thank you so much. Best, Andreas
@aeitel Hi Andreas, thanks for your interest. It's been about 2 years since I've looked at this code and I don't have a copy of ImageNet on my computer right now, but I will start the download today and let you know once I've refreshed my memory. Thanks for your patience.
@aeitel Are your image files named ILSVRC2012_img_train.tar, ILSVRC2012_img_val.tar?
@johnwlambert Thanks for your quick response. I only downloaded ILSVRC2016_CLS-LOC.tar.gz. I have not downloaded ILSVRC2012_img_train.tar or ILSVRC2012_img_val.tar. Does that mean I need to download all three?
Hi, Did you guys solve the problem? I am not sure about the validation part. I have downloaded the ILSVRC2016_CLS-LOC.tar.gz. Thank you.
Hi John, I had to make some changes to create_imagenet_test_set.py and create_bbox_dataset.py. I used ILSVRC2016_CLS-LOC.tar.gz. The two modified files are attached.
I did not look further into your method because I was not able to reproduce the results of the paper when training again. This might be because of different Python or PyTorch versions, or some other small changes I had to make in order to get your code running on my machine. create_dataset.zip
Best, Andreas
Hi @ck-amrahd , could you explain in a few more details about your confusion? The CLS-LOC dataset is the right one.
Andreas, thanks for your efforts to port the code! I promise the method is fully reproducible :-) We were able to generate a full curve showing the effect:
Hopefully after the ECCV deadline I can port some of the code to PyTorch 1.4 and add some more example training commands.
Best wishes, John
Hi @aeitel, thank you for the code. I got the dataset from Kaggle. The Annotation folder contains two folders, train and val. The train folder has one subfolder per class, and each subfolder holds the xml files, but the val folder has the xml files directly inside it. Could you please explain how to create ValAnnotation/val from this? Thank you.
I also cannot download the images from ImageNet: the competition is already closed, so I cannot register and download. I tried to download the original images, but they do not provide access.
Hi @ck-amrahd, try with the two files I changed, see create_dataset.zip in my previous post. It might be that I had to manually change the folder structure as well.
@aeitel, thank you. I am able to run the code, but inside the val folder I have xml files directly instead of class folders. I think it's because I got the dataset from Kaggle, so I am just using the train set for now. Do you have a similar folder structure?
Also @johnwlambert, when I try to start training, it tries to load a pretrained dlupi model, but I don't have it. Can I start from torchvision.models.vgg16? Thank you.
@ck-amrahd I think I also had the xml directly in the val folder.
@aeitel, thank you. But that will create a problem: to extract the synsets, the code calls os.listdir(val_folder), which will return xml file names instead of synset folder names. What do you think?
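One possible workaround for the flat val folder, sketched here under assumptions rather than taken from the repository: read the synset ID out of each xml file instead of relying on folder names. The sketch assumes the val annotations are PASCAL VOC style files whose first <object><name> element holds the synset ID (for example n04562935); this has not been confirmed in this thread. group_val_xml_by_synset is a hypothetical helper name.

```python
import os
import xml.etree.ElementTree as ET
from collections import defaultdict


def group_val_xml_by_synset(val_annotation_dir):
    """Map synset ID -> sorted list of xml file names in a flat val folder.

    Assumption: each xml file has an <object><name> element containing
    the synset ID. Files without that element are skipped.
    """
    synset_to_files = defaultdict(list)
    for fname in sorted(os.listdir(val_annotation_dir)):
        if not fname.endswith(".xml"):
            continue  # ignore anything that is not an annotation file
        root = ET.parse(os.path.join(val_annotation_dir, fname)).getroot()
        # Text of the first <name> under the first <object>, or None.
        synset = root.findtext("object/name")
        if synset is None:
            continue
        synset_to_files[synset.strip()].append(fname)
    return dict(synset_to_files)
```

The keys of the returned dict can then stand in for the synset folder names that os.listdir(val_folder) would return for a per-class layout, and the file lists can be used to move each xml file into a folder named after its synset if the script needs that structure.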