SWINUnetr: bug when padding input in PatchMerging layer
Description
I think there's an error when padding input for the patch merging layer in the Swin Transformer part of Swin Unetr.
More specifically, in swin_unetr.py, line 697, the input is padded with: x = F.pad(x, (0, 0, 0, d % 2, 0, w % 2, 0, h % 2))
However, I think w and h are mixed up here. So the correct call would be: x = F.pad(x, (0, 0, 0, d % 2, 0, h % 2, 0, w % 2)).
To Reproduce
Run Swin Unetr with an input image where one dimension isn't divisible by 32.
Expected behavior
I expect the odd-sized dimensions of x to be padded so that they are even afterwards.
Screenshots
We can see x_shape = [2, 11, 14, 11, 384]. In the screenshot, x has already been padded, but to shape [2, 11, 15, 12, 384], which produces the error: RuntimeError: Sizes of tensors must match except in dimension 4. Expected size 6 but got size 5 for tensor number 1 in the list.
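The shape arithmetic behind that report can be checked without running the model. The sketch below is plain Python, not MONAI code: padded_shape is a hypothetical helper that only mimics how torch.nn.functional.pad reads its tuple, namely as (left, right) pairs starting from the last dimension and moving backwards. It assumes x is laid out as (b, d, h, w, c), so the variable names in swin_unetr.py may map differently.

```python
def padded_shape(shape, pad):
    """Shape after F.pad-style padding (hypothetical helper, shape math only).

    `pad` holds (left, right) pairs, the first pair applying to the LAST
    dimension, the second pair to the one before it, and so on.
    """
    if len(pad) % 2 != 0 or len(pad) // 2 > len(shape):
        raise ValueError("pad must hold (left, right) pairs for trailing dims")
    out = list(shape)
    for i in range(len(pad) // 2):
        dim = len(shape) - 1 - i  # walk backwards from the last dimension
        out[dim] += pad[2 * i] + pad[2 * i + 1]
    return out


x_shape = [2, 11, 14, 11, 384]  # assumed layout: (b, d, h, w, c)
b, d, h, w, c = x_shape

# Tuple quoted from line 697: the d, w, h remainders land on w, h, d.
reported = padded_shape(x_shape, (0, 0, 0, d % 2, 0, w % 2, 0, h % 2))

# Tuple written last-dimension-first for the assumed layout: c, w, h, d.
last_dim_first = padded_shape(x_shape, (0, 0, 0, w % 2, 0, h % 2, 0, d % 2))

print(reported)        # [2, 11, 15, 12, 384]
print(last_dim_first)  # [2, 12, 14, 12, 384]
```

The first result matches the shape in the screenshot, where d stays odd and h becomes odd. Under the stated layout assumption, ordering the remainders as w, h, d leaves all three spatial dimensions even.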
Environment (please complete the following information):
monai 0.8.1+271.g07de215c
Hi @tjades, thanks for the question and the issue. Can you use the latest MONAI release (>1.0.0) and see whether the problem still exists? From what I saw, the latest code should have solved the uneven dimension issue. Let us know if there are still problems, and we can investigate the error further.
Thank you.
Hello, I notice that in the PatchMerging class, x2 and x5 have exactly the same definition:
x2 = x[:, 0::2, 1::2, 0::2, :]
x5 = x[:, 0::2, 1::2, 0::2, :]
To my understanding, either x2 or x5 should be x[:, 1::2, 1::2, 0::2, :], so that x0 ~ x7 would be 8 different vectors.
Is this a bug, or are x2 and x5 intended to be the same?
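To make the counting argument concrete, here is a minimal sketch that enumerates the stride-2 start offsets. It is not the MONAI implementation: slice_expr is a hypothetical helper, the (b, d, h, w, c) layout is an assumption, and the x0 to x7 numbering printed here is arbitrary and need not match the numbering in PatchMerging.

```python
from itertools import product

# Each tuple is the (d, h, w) start offset of one stride-2 slice,
# i.e. x[:, od::2, oh::2, ow::2, :] for an assumed (b, d, h, w, c) layout.
offsets = list(product((0, 1), repeat=3))


def slice_expr(offset):
    """Render one offset as the slicing expression it stands for."""
    od, oh, ow = offset
    return f"x[:, {od}::2, {oh}::2, {ow}::2, :]"


for i, off in enumerate(offsets):
    print(f"x{i} = {slice_expr(off)}")

# The two definitions quoted above use the same offset, so together they
# select one sub-grid instead of two.
quoted = {"x2": (0, 1, 0), "x5": (0, 1, 0)}
print(len(set(offsets)), len(set(quoted.values())))
```

There are exactly 2 x 2 x 2 = 8 distinct offsets, and (1, 1, 0) is one of them, which is the slice suggested above as the replacement for either x2 or x5.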
That was updated in monai >=0.9.1 with an option: https://github.com/Project-MONAI/MONAI/blob/356d2d2f41b473f588899d705bbc682308cee52c/monai/networks/nets/swin_unetr.py#L81-L83
But in the PatchMerging class, I still see that x2 and x5 are the same. So I'm supposed to use another option such as mergingv2, am I right?