Input format confusion

Open PierrickLeroy opened this issue 1 year ago • 1 comments

Hello,

Thanks for your work! The README says the input shape is 112x112x3, but the to_input() function outputs torch.Size([1, 3, 112, 112]) when given a PIL Image.

I assume torch.Size([1, 3, 112, 112]) is the correct input format, since passing 112x112x3 throws an error when I try to use the model. Is that correct?

Jan 12 '25 13:01 PierrickLeroy

I am not an author, but yes, you are right. It makes sense: torch models take as input a tensor of shape (batch_size, *data_shape), and in torch image data is expected in channels-first layout, (nb_of_channels, height, width). The final shape here is therefore (batch_size, nb_of_channels, height, width), i.e. (1, 3, 112, 112) for a single 112x112 RGB image.
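
To make the layout difference concrete, here is a minimal sketch using plain torch. It is not the repository's to_input() implementation, just an illustration of the same reshaping. The names hwc_image and nchw_batch are made up for the example, and the random tensor stands in for a decoded 112x112 RGB image.

```python
import torch

# Hypothetical stand-in for a decoded 112x112 RGB image in
# height x width x channels (HWC) layout, the "112x112x3" from the README.
hwc_image = torch.rand(112, 112, 3)

# Move channels first (HWC -> CHW), then add a leading batch dimension.
nchw_batch = hwc_image.permute(2, 0, 1).unsqueeze(0)

print(hwc_image.shape)   # torch.Size([112, 112, 3])
print(nchw_batch.shape)  # torch.Size([1, 3, 112, 112])
```

The pixel values are unchanged, only the axis order differs: nchw_batch[0, :, y, x] holds the same three channel values as hwc_image[y, x, :].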

Jan 31 '25 09:01 afm215