
SWINUnetr: bug when padding input in PatchMerging layer

Open tjades opened this issue 3 years ago • 4 comments

Description I think there's an error when padding input for the patch merging layer in the Swin Transformer part of Swin Unetr. More specifically, in swin_unetr.py, line 697, the input is padded with: x = F.pad(x, (0, 0, 0, d % 2, 0, w % 2, 0, h % 2))

However, I think w and h are mixed up here. The correct call would be: x = F.pad(x, (0, 0, 0, d % 2, 0, h % 2, 0, w % 2)).
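Because F.pad reads its padding tuple as (before, after) pairs starting from the last dimension and moving backwards, it can help to check which axis each entry actually pads. The sketch below is a pure-Python model of that documented ordering, not MONAI code: `padded_shape` is a hypothetical helper, and the layout (b, d, h, w, c) is an assumption taken from the shape reported in this issue.

```python
def padded_shape(shape, pad):
    """Return the shape after padding, following F.pad's pair ordering.

    `pad` is a flat tuple of (before, after) pairs. The first pair applies
    to the LAST dimension, the second pair to the second-to-last, and so on.
    """
    out = list(shape)
    for pair_idx in range(len(pad) // 2):
        before = pad[2 * pair_idx]
        after = pad[2 * pair_idx + 1]
        out[-1 - pair_idx] += before + after
    return out


# Assumed layout (b, d, h, w, c), using the shape reported in this issue.
shape = [2, 11, 14, 11, 384]
b, d, h, w, c = shape

# Ordering quoted from swin_unetr.py in this issue: pairs for c, w, h, d
# receive 0, d % 2, w % 2, h % 2 respectively.
as_reported = padded_shape(shape, (0, 0, 0, d % 2, 0, w % 2, 0, h % 2))

# Ordering in which every axis is padded by its own parity: c, w, h, d.
per_axis = padded_shape(shape, (0, 0, 0, w % 2, 0, h % 2, 0, d % 2))

print(as_reported)  # [2, 11, 15, 12, 384]
print(per_axis)     # [2, 12, 14, 12, 384]
```

In this particular shape d and w are both odd, so swapping d and w in the tuple gives the same result as the per-axis ordering. The two only diverge when d and w have different parity.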

To Reproduce Run Swin Unetr with an input image where at least one dimension is not divisible by 32.

Expected behavior I expect every odd-sized dimension of x to be padded to an even size.

Screenshots [image] We can see x_shape = [2, 11, 14, 11, 384]. In the screenshot, x has already been padded, but to shape [2, 11, 15, 12, 384], which produces the error: RuntimeError: Sizes of tensors must match except in dimension 4. Expected size 6 but got size 5 for tensor number 1 in the list.
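The "Expected size 6 but got size 5" message is what one would expect when stride-2 slices with offsets 0 and 1 are taken along an axis that is still odd. A minimal sketch of that arithmetic, assuming the patch merging layer slices each spatial axis as 0::2 and 1::2 (`strided_len` is a hypothetical helper, not a MONAI function):

```python
def strided_len(size, start, step=2):
    """Length of the slice [start::step] on an axis of the given size."""
    return len(range(start, size, step))


# Padded shape reported in the screenshot, assumed layout (b, d, h, w, c).
padded = [2, 11, 15, 12, 384]

lengths = {}
for axis_name, size in zip("dhw", padded[1:4]):
    # (length of 0::2, length of 1::2) along this axis
    lengths[axis_name] = (strided_len(size, 0), strided_len(size, 1))

for axis_name, (even_start, odd_start) in lengths.items():
    print(axis_name, even_start, odd_start)
```

For the axes of size 11 and 15 the two slices differ in length (6 vs 5 and 8 vs 7), so the pieces cannot be concatenated along the channel dimension. Only the axis of size 12 yields matching lengths.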

Environment (please complete the following information): monai 0.8.1+271.g07de215c

tjades avatar Dec 19 '22 14:12 tjades

Hi @tjades, thanks for the question and the issue. Can you try the latest MONAI release (>1.0.0) and see whether the problem still exists? As far as I can see, the latest code should have solved the uneven dimension issue. Let us know if there are still problems and we can investigate the error further.

Thank you.

tangy5 avatar Dec 19 '22 19:12 tangy5

Hello, I notice that in the PatchMerging class, x2 and x5 have exactly the same definition: x2 = x[:, 0::2, 1::2, 0::2, :] and x5 = x[:, 0::2, 1::2, 0::2, :]. To my understanding, either x2 or x5 should be x[:, 1::2, 1::2, 0::2, :], so that x0 to x7 would be 8 different vectors. Is this a bug, or are x2 and x5 intended to be the same?
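To make the "8 different vectors" point concrete, the sketch below enumerates every combination of offsets 0 and 1 over the three spatial axes and prints the corresponding slice expression. It is only an illustration of the counting argument, not the library's implementation, and `slice_expr` is a hypothetical helper.

```python
from itertools import product


def slice_expr(i, j, k):
    """Format the stride-2 slice that starts at offsets (i, j, k)."""
    return f"x[:, {i}::2, {j}::2, {k}::2, :]"


# All offset combinations over three spatial axes: 2 ** 3 = 8 of them.
offsets = list(product((0, 1), repeat=3))
exprs = [slice_expr(*o) for o in offsets]

for o, e in zip(offsets, exprs):
    print(o, e)
```

The offset (0, 1, 0) appears exactly once in this list, and (1, 1, 0) is the combination that goes missing when two of the eight slices share the definition x[:, 0::2, 1::2, 0::2, :].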

rongzhao-zhang avatar Mar 17 '23 13:03 rongzhao-zhang

That was updated in MONAI >=0.9.1 with an option: https://github.com/Project-MONAI/MONAI/blob/356d2d2f41b473f588899d705bbc682308cee52c/monai/networks/nets/swin_unetr.py#L81-L83

wyli avatar Mar 17 '23 13:03 wyli

But in the PatchMerging class, I still see that x2 and x5 are the same. So I'm supposed to use another option such as mergingv2, am I right?

rongzhao-zhang avatar Mar 18 '23 02:03 rongzhao-zhang