How do you actually use this?
Calling model(img) on an arbitrary image resized to 256 and unsqueezed to the expected shape (1, 3, 256, 256) does not actually work. What else are you supposed to do to the image before passing it to the model for inference? Very frustrating.
It should work even if the image has arbitrary dimensions, as long as the edge dimensions are multiples of 32.
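Something along these lines should be enough. This is only a rough sketch, assuming the model is already loaded as in the README and that it expects the usual ImageNet-style normalization (double-check the exact mean/std the weights were trained with):

```python
import torch
from PIL import Image
from torchvision import transforms

# assumption: `model` has already been created/loaded as described in the README
preprocess = transforms.Compose([
    transforms.Resize((256, 256)),          # any size whose edges are multiples of 32 should work
    transforms.ToTensor(),                  # HWC uint8 -> CHW float in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet stats; adjust if the weights
                         std=[0.229, 0.224, 0.225]),   # were trained with different normalization
])

img = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)  # (1, 3, 256, 256)

model.eval()
with torch.no_grad():
    logits = model(img)
```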
Thanks for replying.
I was able to determine that my error was caused by the tensor not being sent to the GPU correctly (and it somehow ended up channels-last instead of channels-first). My apologies. I've been having a tough week trying to implement code from papers!
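In case anyone else hits the same thing, the fix looked roughly like this (just a sketch; the image path and the numpy loading route are my own example, not from the repo):

```python
import numpy as np
import torch
from PIL import Image

# load the image as a float array in [0, 1]; numpy gives channels-last (H, W, C)
img_np = np.asarray(Image.open("example.jpg").convert("RGB"), dtype=np.float32) / 255.0

# permute to channels-first (C, H, W), add the batch dim, and move to the GPU
img = torch.from_numpy(img_np).permute(2, 0, 1).unsqueeze(0).to("cuda")
```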
I am specifically trying to use your Places365 pretrained model. Should I assume that your class labels are the same as what is listed here? https://github.com/CSAILVision/places365/blob/master/categories_places365.txt
Thanks again.
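For what it's worth, this is how I'm planning to map output indices to labels. It's only a sketch, assuming the model's class ordering matches that categories_places365.txt file:

```python
# build an index -> label mapping from categories_places365.txt
# (lines look like "/a/airfield 0"; the [3:] strips the leading "/a/" prefix)
with open("categories_places365.txt") as f:
    classes = [line.strip().split(" ")[0][3:] for line in f]

# e.g. classes[12] would be the label corresponding to output index 12
```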
OK, real problem this time, sorry for the double-tap. This made me think I was crazy: using your pretrained Places365 weights and running inference with the example given here, I get the exact same results every time, no matter what input is given (I checked three times to make sure the inputs were actually different after my preprocessing, and they were).
These are those results:
torch.return_types.topk( values=tensor([[2.0731, 1.9153, 1.7019, 1.5919, 1.5876]], device='cuda:0', grad_fn=<TopkBackward0>), indices=tensor([[ 12, 67, 270, 103, 317]], device='cuda:0'))
I even tried reloading the model into a new variable, but same thing.
Is the syntax prediction = model(img) incorrect? Or is something else going on?
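For completeness, this is roughly how I'm calling it (a sketch; img is my preprocessed (1, 3, 256, 256) tensor already on the GPU, and model is loaded from your Places365 weights):

```python
import torch

model = model.to("cuda").eval()      # eval() so batchnorm/dropout run in inference mode
with torch.no_grad():                # also avoids the grad_fn showing up in the outputs
    logits = model(img)              # img: (1, 3, 256, 256) float tensor on cuda
    probs = torch.softmax(logits, dim=1)
    top5 = probs.topk(5, dim=1)

print(top5.values)
print(top5.indices)
```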
Hello, I'm having the same problem right now. Have you found a solution yet?
Can you check with other pretrained weights and see if the issue persists?
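For example, something like this sanity check with a stock torchvision model (just a sketch; img_a and img_b stand for two different preprocessed inputs, and the ImageNet ResNet-50 here is not the Places365 model from this repo):

```python
# Run the same two preprocessed inputs through a stock torchvision ResNet-50.
# If its top-5 indices change between images, the inputs really do differ and
# the problem is likely in the Places365 weights/loading; if they don't change,
# the bug is probably in the preprocessing.
import torch
from torchvision.models import resnet50, ResNet50_Weights

ref_model = resnet50(weights=ResNet50_Weights.DEFAULT).to("cuda").eval()
with torch.no_grad():
    for img in (img_a, img_b):   # two different (1, 3, H, W) tensors on cuda
        print(ref_model(img).topk(5, dim=1).indices)
```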