a-PyTorch-Tutorial-to-Image-Captioning

Dimension error

Open yuhua666 opened this issue 2 years ago • 2 comments

encoder_dim = encoder_out.size(3) fails with: Dimension out of range (expected to be in range of [-2, 1], but got 3)

yuhua666 Dec 08 '22 12:12

Sorry to bother you with something off-topic. Have you been able to download the dataset from the links provided? I've been trying to download it for days, but it doesn't work.

Leo-Thomas Dec 27 '22 01:12

encoder_dim = encoder_out.size(3) Dimension out of range (expected to be in range of [-2, 1], but got 3)

@yuhua666, I am assuming you are not training in batches, which is why you are missing that dimension. Your output features probably look like this:

import torch

encoder_out = torch.rand(196, 2048)  # (num_pixels, dimensionality)
encoder_out.size(3)  # Raises your error: a 2-D tensor has no dimension 3.

[-2, 1] is the range of valid dimension indices for a 2-D tensor. 0 and 1 are the first and second dimensions, and -1 and -2 refer to the same two dimensions counted from the end (e.g. encoder_out.size(-1) = 2048; encoder_out.size(-2) = 196).

What you need in your search function is encoder output like this:

encoder_out = torch.rand(10, 14, 14, 2048)  # (batch_size, width, height, dimensionality)
encoder_dim = encoder_out.size(3)  # Now this returns 2048 (number of filters)
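To make the index range concrete, here is a minimal sketch comparing a 2-D and a 4-D tensor. The helper valid_dim_range is a hypothetical name introduced only for illustration, and the random tensors stand in for real encoder output.

```python
import torch

# Stand-ins for real encoder output (random values, shapes taken from the thread).
flat = torch.rand(196, 2048)             # 2-D: (num_pixels, dimensionality)
batched = torch.rand(10, 14, 14, 2048)   # 4-D: batch dimension plus a 14 x 14 grid


def valid_dim_range(t):
    """Hypothetical helper: inclusive range of indices accepted by t.size(i)."""
    return (-t.dim(), t.dim() - 1)


print(valid_dim_range(flat))      # range for the 2-D tensor
print(valid_dim_range(batched))   # range for the 4-D tensor

# Index 3 is valid only for the 4-D tensor.
encoder_dim = batched.size(3)

# On the 2-D tensor the same call raises IndexError.
try:
    flat.size(3)
    raised = False
except IndexError as err:
    raised = True
    print(err)
```

Checking t.dim() (or t.shape) right before the failing line is usually the quickest way to see which dimension is missing.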

If your GPU cannot hold a larger batch, you can send a single image as a batch of size 1, just to fill that dimension.
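A rough sketch of filling the batch dimension with a batch of size 1 follows. It assumes the un-batched features have shape (196, 2048) and that the 196 pixels come from a 14 x 14 grid stored row by row; both are assumptions, so check them against your own encoder. In practice it may be simpler to add the batch dimension to the input image (for example with image.unsqueeze(0)) before it goes through the encoder.

```python
import torch

# Hypothetical un-batched encoder output: (num_pixels, dimensionality).
features = torch.rand(196, 2048)

# Assumption: the 196 pixels form a 14 x 14 grid in row-major order.
enc_image_size = 14

# Restore the spatial grid and add a leading batch dimension of size 1.
encoder_out = features.reshape(1, enc_image_size, enc_image_size, -1)

# The original failing line now works, because dimension 3 exists.
encoder_dim = encoder_out.size(3)

print(encoder_out.shape)
print(encoder_dim)
```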

AndreiMoraru123 Jan 21 '23 19:01