a-PyTorch-Tutorial-to-Image-Captioning
Dimension error
encoder_dim = encoder_out.size(3)
Dimension out of range (expected to be in range of [-2, 1], but got 3)
Sorry to bother you with something off topic. Have you been able to download the dataset from the links provided? I've been trying to download it for days, but it doesn't work.
@yuhua666, I assume you are not training in batches, which is why that dimension is missing. Your output features probably look like this:
encoder_dim = torch.rand(196, 2048) # (num_pixels, dimensionality)
encoder_dim.size(3) # This will result in your error.
[-2, 1] means the tensor has only two dimensions, so size() only accepts indices from -2 to 1. 0 and 1 are the first two sizes, and -1 and -2 are the same two sizes counted from the end (e.g. encoder_dim.size(-1) = 2048; encoder_dim.size(-2) = 196).
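To make the index range concrete, here is a minimal sketch that reproduces the error on a 2-D tensor. The name `features` and the shape (196, 2048) are assumptions standing in for whatever your encoder actually returns.

```python
import torch

# Hypothetical stand-in for unbatched encoder features: (num_pixels, channels)
features = torch.rand(196, 2048)

print(features.dim())                         # number of dimensions
print(features.size(0), features.size(1))     # sizes via non-negative indices
print(features.size(-2), features.size(-1))   # same sizes via negative indices

message = None
try:
    features.size(3)  # index 3 does not exist on a 2-D tensor
except (IndexError, RuntimeError) as err:
    message = str(err)
    print(message)
```

Printing `encoder_out.shape` right before the failing line in your own code is probably the quickest way to confirm which dimensions are missing.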
What you need at that point in your search function is an encoder output shaped like this:
encoder_dim = torch.rand(10, 14, 14, 2048) # (batch_size, width, height, dimensionality)
encoder_dim.size(3) # Now this will return your 2048 (number of filters)
If your GPU cannot hold larger batches, you can send a single image as a batch of one, just to fill that dimension.
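As a rough sketch of filling that dimension, assuming your features really are (196, 2048) and come from a square 14 x 14 feature map, you could restore the 4-D layout like this. The names `features` and `enc_image_size` are only illustrative.

```python
import math
import torch

# Assumed unbatched, flattened features: (num_pixels, channels)
features = torch.rand(196, 2048)

# Assumption: the feature map is square, so num_pixels = side * side
enc_image_size = math.isqrt(features.size(0))

# Restore (batch_size, height, width, channels) with a batch of one
encoder_out = features.view(1, enc_image_size, enc_image_size, features.size(-1))

encoder_dim = encoder_out.size(3)  # index 3 now exists
print(encoder_out.shape, encoder_dim)

# If the batch dimension is missing on the input image instead,
# unsqueeze(0) adds it before the image goes through the encoder.
image = torch.rand(3, 256, 256)   # assumed (channels, height, width)
batched = image.unsqueeze(0)      # (1, channels, height, width)
print(batched.shape)
```

Which of the two fixes applies depends on where the batch dimension was lost, so check the shape of the image going into the encoder as well as the shape coming out.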