Semi-Supervised-Image-Captioning
TypeError
When I run "python train.py --saveto commoncraw_pretrained --dataset commoncrawl --cutoff 15", I get the following error:
Traceback (most recent call last):
File "train.py", line 341, in
How do I solve it? Thanks!
I will test and get back to you as soon as possible.
Thanks! Looking forward to your reply.
I already found the problem and fixed it by adding the following lines:
y = numpy.zeros((len(feat_list), options['cutoff'], options['semantic_dim']), dtype='float32')
for idx, ff in enumerate(feat_list):
    y[idx] = ff.reshape((-1, options['semantic_dim']))
These are the semantic features described in the paper; they should be 3-dimensional, (batch_size x cutoff x semantic_dimension). We use cutoff = 15, i.e. 15 detected words as input, and semantic_dimension = 300, i.e. 300-dimensional GloVe features.
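For concreteness, here is a minimal sketch of how such a (batch_size, cutoff, semantic_dim) tensor could be assembled from detected words and GloVe vectors; "glove" (a word-to-vector dictionary) and "detected_words_per_image" are hypothetical placeholders, not names from the repository.

import numpy

def build_semantic_feats(detected_words_per_image, glove, cutoff=15, semantic_dim=300):
    # Returns a (batch_size, cutoff, semantic_dim) float32 tensor of GloVe vectors.
    y = numpy.zeros((len(detected_words_per_image), cutoff, semantic_dim), dtype='float32')
    for idx, words in enumerate(detected_words_per_image):
        # Keep at most `cutoff` detected words per image; unused slots stay zero.
        for j, w in enumerate(words[:cutoff]):
            vec = glove.get(w)
            if vec is not None:
                y[idx, j] = vec
    return y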
Thanks for testing the code (I can't test it myself since I can't access the cluster at the moment). Feel free to ask if you have any further questions; I'd be glad to help.
Thanks, I'll try again!
When I run "python train.py --saveto commoncraw_pretrained --dataset commoncrawl --cutoff 15", the code runs fine at the beginning, but fails after 4379 updates:
Epoch 0, Updates: 4378, Cost is: 42.459641
Epoch 0, Updates: 4379, Cost is: 42.928699
Traceback (most recent call last):
File "train.py", line 341, in
I have already fixed the bug by changing this line in generate_caps.py:
sample, score, alpha = gen_sample(f_init[0], f_next[0], ctx_cutoff, cnn_feats[0],
When f_init is a list, ensemble decoding is triggered automatically. Besides, I have also uploaded the best model, "coco_bleu_best.zip"; unzip it to get a .pkl and a .npz file, and with those you can easily call generate_caps.py to reproduce the results reported in the paper.
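In case it helps, here is a minimal sketch of that single-model vs. ensemble choice, assuming f_init/f_next hold one compiled function per loaded model as in generate_caps.py; pick_decoder is a hypothetical helper, not part of the repository.

def pick_decoder(f_init, f_next, use_ensemble=False):
    # gen_sample switches to ensemble decoding when it receives lists of
    # functions, so single-model decoding needs the individual entries,
    # e.g. f_init[0] and f_next[0].
    if isinstance(f_init, list) and not use_ensemble:
        return f_init[0], f_next[0]
    return f_init, f_next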
Thanks for your testing; feel free to ask further questions.
Thank you for your help! I can run the code now, but I have a question: why is the computed CIDEr score so high (CIDEr: 3.819)? The best CIDEr score on the Microsoft COCO Image Captioning Challenge is only 1.146.
Did you use the MS-COCO dataset for testing (not commoncrawl)? Notice that my split could be different from your dataset split. Please refer to "ETHZ-Bootstrapped-Captioning/Data/coco/", where there are files named "caption-train/val/test.json": 5000/5000 images are used for val/test and the rest for training, following the split strategy from Karpathy's github. You should verify that your training data does not contain your validation data; otherwise, you might need to resplit your dataset.
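If you want to double-check your own split, a minimal sketch like the following could work, assuming each caption-*.json file is a list of entries carrying an image identifier; the field name and paths used below are assumptions and may differ from the actual files.

import json

def load_image_ids(path, key='image_id'):
    # Collect the set of image identifiers listed in one split file.
    with open(path) as f:
        return {entry[key] for entry in json.load(f)}

train_ids = load_image_ids('ETHZ-Bootstrapped-Captioning/Data/coco/caption-train.json')
val_ids = load_image_ids('ETHZ-Bootstrapped-Captioning/Data/coco/caption-val.json')
test_ids = load_image_ids('ETHZ-Bootstrapped-Captioning/Data/coco/caption-test.json')

# A clean split must share no images between training and val/test.
overlap = train_ids & (val_ids | test_ids)
print('overlapping images:', len(overlap))  # should print 0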
Oh, I see! Thanks!