
Reproduce results on the SUN dataset

Open Chen-Song opened this issue 5 years ago • 6 comments

Hi, I ran your code but got unsatisfactory results. In your paper, the results on the SUN dataset are 38.1, 89.9, 53.4, but when running your code I get 8.5, 89.9, 15.6. Maybe I missed some important information? Looking forward to your reply. Screenshot from 2019-09-03 10-30-23
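For context (my reading, not stated explicitly in the thread): the three numbers per dataset appear to be unseen accuracy (U), seen accuracy (S), and their harmonic mean (H), the usual GZSL summary metric, so the third number follows from the first two:

```python
def harmonic_mean(seen, unseen):
    # standard GZSL metric: H = 2 * S * U / (S + U)
    return 2 * seen * unseen / (seen + unseen)

# paper's SUN numbers: U=38.1, S=89.9 -> H close to the reported 53.4
print(round(harmonic_mean(89.9, 38.1), 1))
# reproduced numbers: U=8.5, S=89.9 -> H close to the reported 15.6
print(round(harmonic_mean(89.9, 8.5), 1))
```

The small residual gaps come from the published accuracies themselves being rounded to one decimal place.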

Chen-Song avatar Sep 03 '19 02:09 Chen-Song

Screenshot from 2019-09-03 21-35-53 The results on the CUB dataset are also lower than yours, although the seen result exactly matches your seen result. I have not modified your code at all, so this is a bit strange.

Chen-Song avatar Sep 03 '19 13:09 Chen-Song

Hello, may I ask which dataset you used? I downloaded the dataset, but there are no train_loc and val_loc in the proposed split, so I extracted them from trainval_loc according to trainclasses.txt and valclasses.txt. However, the result on seen classes is not good (0.26 on the CUB dataset), while the unseen result is better (0.34).

jwliu-cc avatar Mar 27 '20 10:03 jwliu-cc
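For anyone else stuck on this step: the extraction described above boils down to selecting the trainval indices whose label belongs to the classes listed in trainclasses.txt or valclasses.txt. A minimal sketch with made-up arrays (all names and values here are placeholders, not the repo's actual data):

```python
import numpy as np

# hypothetical 0-based sample indices and their class labels over trainval_loc
id_trainval = np.array([10, 11, 12, 13, 14, 15])
labels_trainval = np.array([0, 0, 1, 2, 2, 3])

train_classes = np.array([0, 2])  # classes assumed listed in trainclasses.txt
val_classes = np.array([1, 3])    # classes assumed listed in valclasses.txt

# keep only the indices whose label falls in each class set
id_train = id_trainval[np.isin(labels_trainval, train_classes)]
id_val = id_trainval[np.isin(labels_trainval, val_classes)]
print(id_train.tolist(), id_val.tolist())
```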

@Carey-cc could you give me an idea of how I can extract trainval_loc and val_loc?

lerndeep avatar May 08 '20 10:05 lerndeep

@Carey-cc how can I prepare the data for the CUB dataset to train the network?

lerndeep avatar May 08 '20 10:05 lerndeep

@lerndeep

  1. You can extract train_loc and val_loc according to trainclasses.txt and valclasses.txt. Python code (imports added for completeness):

     ```python
     import os

     import numpy as np
     import pandas as pd
     import scipy.io as sio
     from sklearn.model_selection import train_test_split


     def load_data(att_path, res_path, val_size, trainval_split=1):
         """
         :param val_size: if there are no train_loc and val_loc, and not even
             trainclasses.txt and valclasses.txt, randomly split train and val
         :param trainval_split: if there are no train_loc and val_loc, this
             parameter decides which txt file to use
         :return:
         """
         att_feats_dat = sio.loadmat(str(att_path))
         res_feats_dat = sio.loadmat(str(res_path))

         features = res_feats_dat['features'].transpose()
         features /= np.max(features, axis=0)
         labels = res_feats_dat['labels'].squeeze().astype(int) - 1
         allclasses_names = att_feats_dat['allclasses_names'].squeeze()
         allclasses_names = np.array([i[0] for i in allclasses_names])

         att_feats = att_feats_dat['att'].transpose()

         try:
             id_train = att_feats_dat['train_loc' + str(trainval_split)].squeeze() - 1
             id_val = att_feats_dat['val_loc' + str(trainval_split)].squeeze() - 1
             train_class = np.unique(labels[id_train])
             val_class = np.unique(labels[id_val])
         except KeyError:
             # there is only trainval_loc, but no train_loc and val_loc
             id_trainval = att_feats_dat['trainval_loc'].squeeze() - 1
             labels_trainval = labels[id_trainval]
             try:
                 # extract train_loc and val_loc according to
                 # trainclasses.txt and valclasses.txt
                 path = os.path.abspath(os.path.dirname(att_path))
                 trainclasses_names = np.loadtxt(
                     path + '/trainclasses' + str(trainval_split) + '.txt', dtype=str)
                 valclasses_names = np.loadtxt(
                     path + '/valclasses' + str(trainval_split) + '.txt', dtype=str)
                 train_class = np.where(
                     pd.Index(trainclasses_names).get_indexer(allclasses_names) >= 0)[0]
                 val_class = np.where(
                     pd.Index(valclasses_names).get_indexer(allclasses_names) >= 0)[0]
             except OSError:
                 trainval_class = np.unique(labels_trainval)
                 train_class, val_class = train_test_split(
                     trainval_class, test_size=val_size, random_state=7)
             # extract train_loc and val_loc from trainval_loc; change
             # labels_trainval to labels if you want to extract them from
             # the whole dataset
             id_train = id_trainval[np.where(
                 pd.Index(pd.unique(train_class)).get_indexer(labels_trainval) >= 0)[0]]
             id_val = id_trainval[np.where(
                 pd.Index(pd.unique(val_class)).get_indexer(labels_trainval) >= 0)[0]]
             print('train classes num: ', len(train_class))
             print('val classes: \r\n', allclasses_names[val_class])

         id_test_unseen = att_feats_dat['test_unseen_loc'].squeeze() - 1

         try:
             id_test_seen = att_feats_dat['test_seen_loc'].squeeze() - 1
         except KeyError:
             id_test_seen = None

         num_class = att_feats.shape[0]
         test_class = np.unique(labels[id_test_unseen])

         if id_test_seen is not None:
             test_class_s = np.unique(labels[id_test_seen])
         else:
             test_class_s = []

         train_x = features[id_train]
         train_y = labels[id_train]
         train_data = list(zip(train_x, train_y))

         val_x = features[id_val]
         val_y = labels[id_val]
         val_data = list(zip(val_x, val_y))

         test_x = features[id_test_unseen]
         test_y = labels[id_test_unseen]
         test_data = list(zip(test_x, test_y))

         if id_test_seen is not None:
             test_s_x = features[id_test_seen]
             test_s_y = labels[id_test_seen]
             test_data_s = list(zip(test_s_x, test_s_y))
             print("test seen", len(test_s_y), len(np.unique(test_s_y)))
         else:
             test_data_s = []

         class_label = dict()
         class_label['train'] = list(train_class)
         class_label['val'] = list(val_class)
         class_label['test'] = list(test_class)
         class_label['test_s'] = list(test_class_s)
         class_label['num_class'] = num_class
         return (att_feats, train_data, val_data, test_data, test_data_s,
                 class_label, allclasses_names)
     ```

You can also do this in MATLAB.

  2. You can just download the dataset from https://www.dropbox.com/sh/btoc495ytfbnbat/AAAaurkoKnnk0uV-swgF-gdSa?dl=0 (provided by https://github.com/edgarschnfld/CADA-VAE-PyTorch). But this split is kind of strange: it extracts train_loc and val_loc from the whole dataset.
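Whichever split you end up with, it is worth sanity-checking that the train and val class sets are disjoint, since zero-shot splits are class-disjoint by construction. A sketch with made-up label arrays (not the actual dataset):

```python
import numpy as np

train_y = np.array([0, 0, 2, 2, 5])  # made-up labels at train_loc
val_y = np.array([1, 3, 3, 4])       # made-up labels at val_loc

# intersection of the two class sets; should be empty for a proper ZSL split
overlap = set(np.unique(train_y)) & set(np.unique(val_y))
print(sorted(overlap))
```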

jwliu-cc avatar May 09 '20 06:05 jwliu-cc

Hello, when I reproduce the results on CUB, why is the output always 0? The AWA results are also unclear. Could you provide the model you trained?

Programmergg avatar Sep 22 '20 13:09 Programmergg