GDAN
Reproduce results on the SUN dataset
Hi,
I ran your code but got unsatisfactory results. In your paper, the results on the SUN dataset are 38.1, 89.9, 53.4, but when running your code I get 8.5, 89.9, 15.6. Maybe I missed some important detail? Looking forward to your reply.
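For context, the three numbers per dataset appear to be unseen accuracy (U), seen accuracy (S), and their harmonic mean H = 2·S·U/(S+U), the usual GZSL metric. A quick check (my own sketch, not code from this repo) shows the third number follows from the first two:

```python
def harmonic_mean(seen, unseen):
    # GZSL papers usually report H = 2 * S * U / (S + U)
    return 2 * seen * unseen / (seen + unseen)

print(harmonic_mean(89.9, 38.1))  # ~53.5, the paper's 53.4 up to rounding
print(harmonic_mean(89.9, 8.5))   # ~15.5, the reported 15.6 up to rounding
```

So the seen accuracy reproduces exactly (89.9), and the low H comes entirely from the collapsed unseen accuracy.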
The results on the CUB dataset are lower than yours, yet the seen result is exactly the same as your seen result. I have not modified your code at all, so this is a bit strange.
Hello, may I ask which dataset you used? I downloaded the dataset, but there is no train_loc or val_loc in the proposed split, so I extracted them from trainval_loc according to trainclasses.txt and valclasses.txt. However, the result on seen classes is not good, only 0.26 on the CUB dataset, while the unseen result is better at 0.34.
@Carey-cc could you give me an idea of how I can extract train_loc and val_loc?
@Carey-cc how can I prepare the CUB dataset for training the network?
@lerndeep
- You can extract train_loc and val_loc according to trainclasses.txt and valclasses.txt. Python code:

```python
import os

import numpy as np
import pandas as pd
import scipy.io as sio
from sklearn.model_selection import train_test_split


def load_data(att_path, res_path, val_size, trainval_split=1):
    """
    :param val_size: if there is no train_loc/val_loc and no trainclasses.txt/
        valclasses.txt either, randomly split train and val with this ratio
    :param trainval_split: if there is no train_loc and val_loc, this parameter
        decides which txt file to use
    :return:
    """
    att_feats_dat = sio.loadmat(str(att_path))
    res_feats_dat = sio.loadmat(str(res_path))

    features = res_feats_dat['features'].transpose()
    features /= np.max(features, axis=0)
    labels = res_feats_dat['labels'].squeeze().astype(int) - 1
    allclasses_names = att_feats_dat['allclasses_names'].squeeze()
    allclasses_names = np.array([i[0] for i in allclasses_names])

    att_feats = att_feats_dat['att'].transpose()

    try:
        id_train = att_feats_dat['train_loc' + str(trainval_split)].squeeze() - 1
        id_val = att_feats_dat['val_loc' + str(trainval_split)].squeeze() - 1
        train_class = np.unique(labels[id_train])
        val_class = np.unique(labels[id_val])
    except KeyError:  # there is only trainval_loc, no train_loc and val_loc
        id_trainval = att_feats_dat['trainval_loc'].squeeze() - 1
        labels_trainval = labels[id_trainval]
        try:
            # extract train_loc and val_loc according to trainclasses.txt and valclasses.txt
            path = os.path.abspath(os.path.dirname(att_path))
            trainclasses_names = np.loadtxt(path + '/trainclasses' + str(trainval_split) + '.txt', dtype=str)
            valclasses_names = np.loadtxt(path + '/valclasses' + str(trainval_split) + '.txt', dtype=str)
            train_class = np.where(pd.Index(trainclasses_names).get_indexer(allclasses_names) >= 0)[0]
            val_class = np.where(pd.Index(valclasses_names).get_indexer(allclasses_names) >= 0)[0]
        except OSError:
            # no txt files either: randomly split the trainval classes
            trainval_class = np.unique(labels_trainval)
            train_class, val_class = train_test_split(trainval_class, test_size=val_size, random_state=7)
        # extract train_loc and val_loc from trainval_loc
        # (change labels_trainval to labels to extract them from the whole dataset)
        id_train = id_trainval[np.where(pd.Index(pd.unique(train_class)).get_indexer(labels_trainval) >= 0)[0]]
        id_val = id_trainval[np.where(pd.Index(pd.unique(val_class)).get_indexer(labels_trainval) >= 0)[0]]
        print('train classes num: ', len(train_class))
        print('val classes: \r\n', allclasses_names[val_class])

    id_test_unseen = att_feats_dat['test_unseen_loc'].squeeze() - 1
    try:
        id_test_seen = att_feats_dat['test_seen_loc'].squeeze() - 1
    except KeyError:
        id_test_seen = None

    num_class = att_feats.shape[0]
    test_class = np.unique(labels[id_test_unseen])
    if id_test_seen is not None:
        test_class_s = np.unique(labels[id_test_seen])
    else:
        test_class_s = []

    train_x = features[id_train]
    train_y = labels[id_train]
    train_data = list(zip(train_x, train_y))

    val_x = features[id_val]
    val_y = labels[id_val]
    val_data = list(zip(val_x, val_y))

    test_x = features[id_test_unseen]
    test_y = labels[id_test_unseen]
    test_data = list(zip(test_x, test_y))

    if id_test_seen is not None:
        test_s_x = features[id_test_seen]
        test_s_y = labels[id_test_seen]
        test_data_s = list(zip(test_s_x, test_s_y))
        print("test seen", len(test_s_y), len(np.unique(test_s_y)))
    else:
        test_data_s = []

    class_label = dict()
    class_label['train'] = list(train_class)
    class_label['val'] = list(val_class)
    class_label['test'] = list(test_class)
    class_label['test_s'] = list(test_class_s)
    class_label['num_class'] = num_class
    return att_feats, train_data, val_data, test_data, test_data_s, class_label, allclasses_names
```
You can also do this in MATLAB.
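The class-name matching in the snippet above hinges on `pandas.Index.get_indexer`, which returns -1 for names not in the index. A minimal synthetic check of just that step (the class names are made up for illustration):

```python
import numpy as np
import pandas as pd

# hypothetical stand-ins for allclasses_names and the trainclasses.txt contents
allclasses_names = np.array(['albatross', 'cardinal', 'finch', 'sparrow'])
trainclasses_names = np.array(['cardinal', 'sparrow'])

# same step as in load_data: indices of all classes present in the train list
# (get_indexer returns -1 for misses, so ">= 0" keeps only the matches)
train_class = np.where(pd.Index(trainclasses_names).get_indexer(allclasses_names) >= 0)[0]
print(train_class)  # [1 3]
```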
- You can also just download the dataset from https://www.dropbox.com/sh/btoc495ytfbnbat/AAAaurkoKnnk0uV-swgF-gdSa?dl=0 (provided by https://github.com/edgarschnfld/CADA-VAE-PyTorch). But this split is a bit strange: it extracts train_loc and val_loc from the whole dataset.
Hello, when I reproduce the results on CUB, the output is always 0, and the AwA result is unclear as well. Could you provide the model you trained?