vqa-winner-cvprw-2017
30% accuracy in training
I downloaded the code and tried to reproduce your score on the VQA 2.0 set. Since my machine cannot hold the whole training set, I split `vqa_train_final.json` and `coco_features.npy` into 7 folds, each grouped by image_id (e.g. `vqa_train_final.0.json` contains image ids [1, 2, 3], `coco_features.0.npy` contains the image features for [1, 2, 3], and no other fold has any data for those images). I trained the model in two ways: one loads folds 0 through 6 within each epoch and repeats this for 50 epochs; the other trains 50 epochs on one fold and then moves on to the next fold. However, both of them result in a low accuracy, around 30%. The tokenized questions and the COCO 36-box features were downloaded from the links you described. What do you think might be the cause? Thanks.
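For clarity, the two training regimes described above can be sketched as schedules of (epoch, fold) pairs; `fold_schedule` is a hypothetical helper, not part of the repository's code:

```python
def fold_schedule(interleaved, n_folds=7, n_epochs=50):
    """Yield (epoch, fold) pairs for the two regimes described above.

    interleaved=True : every epoch visits all folds 0..6 (regime one).
    interleaved=False: 50 epochs on one fold, then the next (regime two).
    """
    if interleaved:
        for epoch in range(n_epochs):
            for fold in range(n_folds):
                yield epoch, fold
    else:
        for fold in range(n_folds):
            for epoch in range(n_epochs):
                yield epoch, fold
```

Both schedules visit every fold 50 times; they differ only in whether a single optimizer epoch ever sees more than one fold's images.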
This is how I split the data:
```python
import base64
import csv
import json
import os

import numpy as np


def split_images():
    # Split the COCO train2014 image ids into 7 roughly equal folds.
    list_train = os.listdir('G:/train2014/')
    list_train.remove('COCO_train2014_000000372405.jpg')  # drop one excluded image
    ids = [int(f[15:27]) for f in list_train]
    length = len(ids) // 7 + 1
    ids_list = [ids[i:i + length] for i in range(0, len(ids), length)]
    for i in range(len(ids_list)):
        np.savetxt('split/imageIds.train.' + str(i), ids_list[i], fmt='%d')


def split_json():
    # Write one question file per fold, grouped by image_id.
    train = json.load(open('vqa_train_final.json'))
    for i in range(7):
        ids = np.loadtxt('split/imageIds.train.' + str(i)).astype(int)
        s = set(ids)
        data = [entry for entry in train if entry['image_id'] in s]
        json.dump(data, open('split/vqa_train_final.json.' + str(i), 'w'))


def split_features():
    # Write one {image_id: features} dict per fold. `infile` and
    # FIELDNAMES are assumed to be defined as in the bottom-up-attention
    # tsv reading script (the tsv is re-read once per fold here).
    for k in range(7):
        ids = np.loadtxt('split/imageIds.train.' + str(k)).astype(int)
        s = set(ids)
        in_data = {}
        with open(infile, 'rt') as tsv_in_file:
            reader = csv.DictReader(tsv_in_file, delimiter='\t', fieldnames=FIELDNAMES)
            for i, item in enumerate(reader, 1):
                if i % 1000 == 0:
                    print(k, i)
                try:
                    image_id = int(item['image_id'])
                    if image_id in s:
                        # base64.decodestring is deprecated; decodebytes is the
                        # same decoder under its current name.
                        b = base64.decodebytes(bytes(item['features'], encoding='utf8'))
                        in_data[image_id] = np.frombuffer(b, dtype=np.float32).reshape((36, -1))
                except Exception:
                    print('error', item['image_id'])
        # np.save appends '.npy', so this writes coco_features.npy.train.k.npy
        np.save('split/coco_features.npy.train.' + str(k), in_data)
```