a-PyTorch-Tutorial-to-Image-Captioning
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
I tried to overfit the model, but it seems very hard to achieve. I've reduced the data to 100 images with 1 caption per image, and the lowest loss I...
I run the code on a Linux server using a GPU, but CPU usage is very high. Is there anything wrong?
In the CaptionDataset's `__getitem__()` method, you divided the image tensor by 255: `img = torch.FloatTensor(self.imgs[i // self.cpi] / 255.)`. Is it for regularization or for something else? Thanks!
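For context, dividing by 255 is input normalization rather than regularization: the images are stored as uint8 intensities in [0, 255], and the division rescales them to floats in [0, 1], which is the range the subsequent ImageNet mean/std normalization expects. A minimal sketch (the variable names here are illustrative):

```python
import torch

# Raw image pixels are stored as uint8 in [0, 255].
raw = torch.tensor([[0, 128, 255]], dtype=torch.uint8)

# Dividing by 255 rescales them to floats in [0, 1], matching what the
# downstream mean/std normalization expects as input.
img = torch.FloatTensor(raw.numpy() / 255.)
```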
Hello, thanks for your nice code. Does the size of the vocabulary affect the final result? The vocabulary size in other people's work is different from yours on...
In line 94 of caption.py you use `scores = F.log_softmax(scores, dim=1)`. Could you explain the reason for `log_softmax` here? You did not use it in the `forward()` method. More than that,...
Hey, I'm just wondering: can I use the original captions from Flickr8k (Flickr8k.token.txt) and Flickr30k (results_20130124.token) with this code, instead of the captions from Karpathy's split? Thank you very much.
Hi, thanks for your great tutorial, with a nice guide and code. After reading the decoder's code, I found that you just use the LSTM's hidden states to compute the next word's...
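For reference, in this decoder design the attention context does influence the prediction, but indirectly: it is fed into the LSTM as part of its input, and the next-word scores are then projected from the hidden state alone. A minimal sketch of one decode step under that assumption (dimensions and names are illustrative):

```python
import torch
import torch.nn as nn

embed_dim, encoder_dim, decoder_dim, vocab_size = 32, 64, 128, 100
lstm_cell = nn.LSTMCell(embed_dim + encoder_dim, decoder_dim)
fc = nn.Linear(decoder_dim, vocab_size)

emb = torch.randn(1, embed_dim)         # embedding of the previous word
context = torch.randn(1, encoder_dim)   # attention-weighted encoder features
h = torch.zeros(1, decoder_dim)
c = torch.zeros(1, decoder_dim)

# The context enters the LSTM as part of its input; the next word's
# scores are then computed from the hidden state alone.
h, c = lstm_cell(torch.cat([emb, context], dim=1), (h, c))
scores = fc(h)                          # (1, vocab_size)
```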
Great tutorial, thanks! In the case of "hard" attention, you mentioned in your tutorial that it is not differentiable, so maybe this is why a new objective function `Ls` is...
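Right: hard attention samples a single location stochastically, so gradients cannot flow through the sampling step and a REINFORCE-style objective is needed; soft attention instead takes a differentiable weighted average over all locations. A minimal sketch of the soft-attention context vector (names and sizes are illustrative):

```python
import torch
import torch.nn.functional as F

num_pixels, encoder_dim = 196, 512      # e.g. a 14x14 encoder feature map
features = torch.randn(1, num_pixels, encoder_dim)
att_scores = torch.randn(1, num_pixels) # unnormalized attention energies

# Soft attention: softmax weights sum to 1, and the context vector is a
# weighted average over ALL pixels -- fully differentiable end to end.
alpha = F.softmax(att_scores, dim=1)                  # (1, num_pixels)
context = (features * alpha.unsqueeze(2)).sum(dim=1)  # (1, encoder_dim)
```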
I tried to run the code and got some errors when creating the input files. Could you list the required package versions to run the code?
`incomplete_inds = [ind for ind, next_word in enumerate(next_word_inds) if next_word != word_map['<end>']]` — `incomplete_inds` is always `[0, 1, 2, 3, 4]`, and then `complete_inds = list(set(range(len(next_word_inds))) - set(incomplete_inds))` is empty, so `complete_seqs` is...
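If `incomplete_inds` never shrinks, no beam is ever predicting the `<end>` token; common causes are an undertrained model or a word map at inference that differs from the one used in training. The partition logic itself is straightforward, as this toy example shows (values are made up, and it assumes `<end>` is in the word map):

```python
word_map = {'a': 0, 'cat': 1, '<end>': 2}
next_word_inds = [1, 2, 0, 1, 2]   # predicted word index for each of 5 beams

# Beams whose newest word is NOT <end> continue; the rest are finished.
incomplete_inds = [ind for ind, next_word in enumerate(next_word_inds)
                   if next_word != word_map['<end>']]
complete_inds = list(set(range(len(next_word_inds))) - set(incomplete_inds))

print(incomplete_inds)  # [0, 2, 3]
```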