image-classification-caltech-256

How to split the dataset

Open WXIAO-TJ opened this issue 6 years ago • 1 comment

Hello, I want to know how you split the "caltech-256" dataset into 16980 training images and 5120 testing images. Or where could I download the split dataset directly? I would appreciate it if you could tell me.

WXIAO-TJ Nov 16 '19 03:11

Hello, I have the same concern.

  1. After downloading caltech256, I see it has 257 sub-folders in total and one .txt file listing all the folder names. So I guess that to train a classifier on caltech256, I need to create the training set and evaluation set manually. Is that right? Or how do you build separate datasets from caltech256 for training and validation?

  2. In your GitHub repo, I could not find "train_no_resizing" and "val", but found "train_metadata.csv" and "val_metadata.csv" instead. Is it right to use these files and change the code accordingly?

    train_folder = ImageFolder(data_dir + 'train_no_resizing', train_transform)
    val_folder = ImageFolder(data_dir + 'val', val_transform)

  3. I ran the code, but it raised an error because data_dir is fixed as "data_dir = '/home/ubuntu/data/'". If I need to change it, should it point to the folder that contains caltech256, or somewhere else? That does not seem right, but I am not sure.

Thanks

vietvo89 Feb 26 '20 05:02
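
For anyone landing here with the same question, below is a minimal sketch of one way to build the two folders by hand. It is not the repository author's method, which this thread does not describe. It assumes the extracted dataset is one sub-folder per class containing .jpg files, and it holds out a fixed number of images per class for validation. The default of 20 is a guess: 5120 happens to equal 256 × 20, but nothing in the thread confirms that. The function names `split_per_class` and `copy_split` and the `skip` parameter are hypothetical, introduced only for this example.

```python
import random
import shutil
from pathlib import Path


def split_per_class(src_dir, val_per_class=20, seed=0, skip=()):
    """Return (train, val) lists of image paths, sampled per class folder.

    Assumption: src_dir holds one sub-folder per class with .jpg files.
    Folder names listed in `skip` are ignored (hypothetical parameter,
    e.g. for leaving out a background/clutter class).
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val = [], []
    class_dirs = sorted(
        p for p in Path(src_dir).iterdir() if p.is_dir() and p.name not in skip
    )
    for class_dir in class_dirs:
        images = sorted(
            f for f in class_dir.iterdir() if f.suffix.lower() == ".jpg"
        )
        rng.shuffle(images)
        val.extend(images[:val_per_class])    # first N go to validation
        train.extend(images[val_per_class:])  # the rest go to training
    return train, val


def copy_split(paths, dst_dir):
    """Copy files to dst_dir/<class folder>/<file name>."""
    for src in paths:
        target = Path(dst_dir) / src.parent.name / src.name
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)
```

With this layout, copying the train list into a folder named train_no_resizing and the val list into a folder named val, both under the same parent, would give ImageFolder the one-folder-per-class structure it expects, and data_dir would then be the path of that parent folder (with a trailing slash, since the quoted code concatenates strings). Note that this sketch puts all remaining images into the training list, so it will not necessarily reproduce the 16980 figure mentioned above.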