PaddleSeg: Input image dimensions in Vision Transformer [General Issue]


Open a123b12cd opened this issue 3 years ago • 5 comments

PaddleSeg 2.5, PaddlePaddle 2.1.0, Linux, Python 3.7, CUDA 11.2

When using VisionTransformer as the backbone, do the width and height of the input image have to be equal?

Jun 14 '22 09:06 a123b12cd

No, that's not required.

Jun 15 '22 02:06 wuyefeilin

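"No, that's not required."

Then how should img_size be configured? Passing an array directly raises an error, and cropping the images would hurt the results. (Screenshot attached: 微信截图_20220615105412)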

Jun 15 '22 02:06 a123b12cd

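img_size mainly relates to pos-embed (the position embedding), and this parameter can only be an integer. In forward, there is no requirement on the shape of x.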
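A minimal sketch of why an integer img_size does not force square inputs, assuming the position embedding is laid out on a square grid of img_size // patch_size patches per side, and that a mismatched input grid is handled by resizing the embedding (for example by interpolation). The function names below are hypothetical and are not part of PaddleSeg; check the backbone source for the actual behavior.

```python
def patch_grid(height, width, patch_size=16):
    """Number of patches along (height, width) for a given input size."""
    return height // patch_size, width // patch_size


def needs_pos_embed_resize(img_size, height, width, patch_size=16):
    """True if the input's patch grid differs from the grid implied by img_size.

    Assumption: the position embedding was created for a square
    img_size x img_size input, so its grid is square.
    """
    embed_grid = patch_grid(img_size, img_size, patch_size)
    input_grid = patch_grid(height, width, patch_size)
    return embed_grid != input_grid


# A 512x384 input gives a non-square 32x24 patch grid, so an embedding
# built for img_size=224 (14x14 grid) would have to be resized to match.
print(patch_grid(512, 384))                   # (32, 24)
print(needs_pos_embed_resize(224, 512, 384))  # True
print(needs_pos_embed_resize(224, 224, 224))  # False
```

Under these assumptions, img_size only fixes the size of the stored embedding; the input itself can be rectangular as long as the embedding is resized to the input's patch grid.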

Jun 15 '22 06:06 wuyefeilin

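You can modify the corresponding code yourself so that it accepts an array. In general, each dimension needs to be an integer multiple of patch_size.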
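As an alternative to cropping, one option is to pad each side up to the next multiple of patch_size. The helper below is a hypothetical illustration of that arithmetic only, not a PaddleSeg API; the default patch_size of 16 is an assumption.

```python
def round_up_to_multiple(height, width, patch_size=16):
    """Return the smallest (H, W) >= (height, width) divisible by patch_size."""
    def round_up(n):
        return ((n + patch_size - 1) // patch_size) * patch_size

    return round_up(height), round_up(width)


# A 500x375 image would be padded to 512x384 for patch_size=16.
print(round_up_to_multiple(500, 375))  # (512, 384)
```

Sizes that are already multiples of patch_size are returned unchanged, so the helper can be applied to every image unconditionally.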

Jun 15 '22 07:06 shiyutang

After reading the code I mostly understand it now. Thanks.

Jun 15 '22 10:06 a123b12cd

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

Dec 09 '22 17:12 github-actions[bot]
