Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning

Issues (119)

I tried to generate a caption with caption.py, but I hit this warning and the script stopped. (img_cap_py3) volquelme@ubuntu:~/show_attend_and_tell_pytorch$ python caption.py --img='test_image/test1.jpg' --model='BEST_checkpoint_flickr8k_5_cap_per_img_5_min_word_freq.pth.tar' --word_map='Flickr8k_output/WORDMAP_flickr8k_5_cap_per_img_5_min_word_freq.json' --beam_size=5 /home/volquelme/anaconda3/envs/img_cap_py3/lib/python3.6/site-packages/skimage/transform/_warps.py:24: UserWarning: The default multichannel argument...
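The `UserWarning` shown above is informational and does not by itself stop a script, so the real failure is likely further down the truncated traceback. If the warning noise is the concern, a minimal sketch for silencing it (matching the message text shown above):

```python
import warnings

# Silence skimage's deprecation notice about the multichannel default.
# This only hides the warning; any actual crash has a separate cause.
warnings.filterwarnings(
    "ignore",
    message="The default multichannel argument",
    category=UserWarning,
)
```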

Hi @sgrvinod, thanks for your code. When using beam search, how do we perform ensemble testing (testing multiple models and averaging predictions across models)? Should we add all the log...
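Since this tutorial's beam search works in log space, one common recipe is to average each model's log-probabilities at every decode step before the usual top-k selection. A minimal sketch, assuming a hypothetical `step()` method on each decoder and a shared vocabulary across models:

```python
import torch
import torch.nn.functional as F

def ensemble_scores(decoders, k_prev_words, states, encoder_outs):
    """One ensemble decode step: average per-model log-probabilities
    before the usual beam-search top-k. The .step() API here is a
    hypothetical stand-in, not part of this repo."""
    log_probs, new_states = [], []
    for dec, state, enc_out in zip(decoders, states, encoder_outs):
        logits, state = dec.step(k_prev_words, state, enc_out)  # hypothetical API
        log_probs.append(F.log_softmax(logits, dim=1))          # (s, vocab_size)
        new_states.append(state)
    # Arithmetic mean of log-probs = geometric mean of probabilities;
    # summing instead of averaging gives a product-of-experts variant.
    scores = torch.stack(log_probs, dim=0).mean(dim=0)
    return scores, new_states
```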

When I ran create_input_files.py, it saved 7 JSON files and 3 HDF5 files. Reading them, I understand the JSON files, but I don't know what the values...
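For reference, the 7 JSONs are the word map plus per-split captions and caption lengths, and the 3 HDF5 files hold the raw images for the train/val/test splits: in this repo's create_input_files.py, each HDF5 stores an `images` dataset and a `captions_per_image` attribute. A sketch for inspecting them, with file names following the convention visible in the logs above:

```python
import h5py
import json

# File names follow create_input_files.py's convention; this base name
# matches the Flickr8k run shown in the logs above.
base = 'flickr8k_5_cap_per_img_5_min_word_freq'

with h5py.File('TRAIN_IMAGES_' + base + '.hdf5', 'r') as h:
    print(h['images'].shape)              # (num_train_images, 3, 256, 256), uint8
    print(h.attrs['captions_per_image'])  # 5

# The per-split JSONs hold the encoded captions and their true lengths.
with open('TRAIN_CAPTIONS_' + base + '.json') as f:
    captions = json.load(f)  # lists of word indices, padded to max_len
with open('TRAIN_CAPLENS_' + base + '.json') as f:
    caplens = json.load(f)   # caption lengths including <start> and <end>
```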

Hi, not a bug per se, but I couldn't train on Windows 10 at first; I had to set the dataloader workers to 0. And for `scores, *_ = pack_padded_sequence(scores, decode_lengths,...
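On the second point: in recent PyTorch versions, pack_padded_sequence returns a PackedSequence namedtuple, and a common rewrite of that line takes its .data field explicitly. A sketch with dummy tensors standing in for the ones in train.py:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Dummy stand-ins for the tensors in train.py.
scores = torch.randn(4, 10, 100)                # (batch, max_decode_len, vocab)
targets = torch.zeros(4, 10, dtype=torch.long)  # (batch, max_decode_len)
decode_lengths = [10, 8, 6, 3]                  # sorted descending, as in train.py

# pack_padded_sequence returns a PackedSequence namedtuple here, so taking
# .data explicitly replaces the fragile `scores, *_ = ...` unpacking.
scores = pack_padded_sequence(scores, decode_lengths, batch_first=True).data
targets = pack_padded_sequence(targets, decode_lengths, batch_first=True).data
```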

Hi @sgrvinod, in caption.py, line 97, `scores = top_k_scores.expand_as(scores) + scores # (s, vocab_size)`. I wonder why it's added; shouldn't it be multiplied?
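The addition is correct because caption.py applies F.log_softmax just before this line, so `scores` holds log-probabilities: adding in log space is multiplying the underlying probabilities. A small illustration:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(5, 9490)             # (s, vocab_size) dummy decoder output
log_probs = F.log_softmax(logits, dim=1)  # log p(word | hypothesis so far)

top_k_scores = torch.zeros(5, 1)          # running log p(hypothesis so far)
# log p(hypothesis + word) = log p(hypothesis) + log p(word | hypothesis):
# adding log-probabilities multiplies the underlying probabilities.
scores = top_k_scores.expand_as(log_probs) + log_probs  # (s, vocab_size)
```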

Hi, I am wondering why you use sum(decode_lengths), which to me means the total number of tokens in the batch, as the count to update the loss metrics. Isn't...
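The loss in train.py is a mean over the packed, unpadded tokens, so sum(decode_lengths) is exactly how many terms went into that mean; weighting the running average by the token count keeps it consistent across batches with different caption lengths. A sketch of the bookkeeping, mirroring the AverageMeter helper in utils.py:

```python
class AverageMeter:
    """Running average where each update carries a sample count,
    mirroring the AverageMeter helper in this repo's utils.py."""
    def __init__(self):
        self.sum, self.count, self.avg = 0.0, 0, 0.0

    def update(self, val, n=1):
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

losses = AverageMeter()
# The loss is a mean over all real (unpadded) tokens in the batch, so the
# right weight is the token count, not the batch size:
batch_loss = 2.31                # example mean cross-entropy for one batch
num_tokens = sum([10, 8, 6, 3])  # sum(decode_lengths)
losses.update(batch_loss, num_tokens)
print(losses.avg)
```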

Thank you very much for your useful tutorial. Could you kindly offer a tutorial for video captioning with soft attention? For example, this paper, Describing videos by exploiting...

Hi, I wanted to swap in different pretrained CNN models and see how they affect the final results, so I switched resnet101 to resnet50 in models.py and ran train.py. The model...
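ResNet-50 and ResNet-101 both end in 2048-channel feature maps, so the decoder's encoder_dim stays valid after that swap (ResNet-18/34 end in 512 channels and would need changes); the usual pitfall is instead loading a checkpoint that was trained with the old backbone. A quick shape check, following the encoder construction in models.py:

```python
import torch
import torchvision

# ResNet-50 and ResNet-101 both output 2048-channel feature maps, so the
# decoder's encoder_dim=2048 still holds; ResNet-18/34 output 512 channels.
resnet = torchvision.models.resnet50(pretrained=True)
modules = list(resnet.children())[:-2]  # drop avgpool and fc, as in models.py
encoder = torch.nn.Sequential(*modules)

with torch.no_grad():
    feats = encoder(torch.randn(1, 3, 256, 256))
print(feats.shape)  # torch.Size([1, 2048, 8, 8])
```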

https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning/blob/b0467042e3fec1ef72c323ccb41fb174a4f1ea52/train.py#L64 Why do you use two optimizers here? It seems other people use only one optimizer, which accepts both the encoder's and decoder's params: https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/03-advanced/image_captioning/train.py#L45 Thanks
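Two optimizers simply let the fine-tuned encoder and the decoder run at different learning rates; a single optimizer with parameter groups achieves the same thing. A sketch with stand-in modules, using the default learning rates from train.py:

```python
import torch

# Stand-in modules; in train.py these are the Encoder and DecoderWithAttention.
encoder = torch.nn.Linear(10, 10)
decoder = torch.nn.Linear(10, 10)

# One optimizer, two parameter groups: equivalent to the two-optimizer setup,
# and still lets the fine-tuned encoder use a smaller learning rate.
optimizer = torch.optim.Adam([
    {'params': encoder.parameters(), 'lr': 1e-4},  # encoder_lr in train.py
    {'params': decoder.parameters(), 'lr': 4e-4},  # decoder_lr in train.py
])
```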

After training, when I use the model to generate captions, it gives me the error below: `File "caption.py", line 215, in seq, alphas = caption_image_beam_search(encoder, decoder, args.img, word_map, args.beam_size)...
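The truncated traceback doesn't show the root cause, but one frequently reported failure mode in this function is that no beam emits <end> within the 50-step limit, leaving complete_seqs_scores empty. A hedged guard sketch (variable names match caption.py; dummy values stand in for the real beam state):

```python
import torch

# Dummy stand-ins for the beam-search state in caption_image_beam_search().
complete_seqs, complete_seqs_scores = [], []   # nothing reached <end>
seqs = torch.tensor([[9, 23, 7], [9, 4, 12]])  # surviving (incomplete) beams
top_k_scores = torch.tensor([[-1.2], [-3.4]])  # their running log-probs

# Hypothetical guard: if no hypothesis emitted <end> within the step limit,
# fall back to the best incomplete beam instead of crashing on an empty list.
if len(complete_seqs_scores) == 0:
    complete_seqs = seqs.tolist()
    complete_seqs_scores = top_k_scores.squeeze(1).tolist()

i = complete_seqs_scores.index(max(complete_seqs_scores))
seq = complete_seqs[i]
print(seq)
```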