PaddleSeg

Input image dimensions in Vision Transformer [General Issue]

Open a123b12cd opened this issue 3 years ago • 5 comments

PaddleSeg 2.5, PaddlePaddle 2.1.0, Linux, Python 3.7, CUDA 11.2

When using VisionTransformer as the backbone, do the width and height of the input image have to be equal?


a123b12cd avatar Jun 14 '22 09:06 a123b12cd

No, they don't need to be.

wuyefeilin avatar Jun 15 '22 02:06 wuyefeilin

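"No, they don't need to be."

Then how should I configure img_size? Passing an array directly raises an error, and cropping the images would hurt the results. (screenshot: 微信截图_20220615105412)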

a123b12cd avatar Jun 15 '22 02:06 a123b12cd

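img_size mainly relates to the position embedding (pos-embed), and this parameter can only be an integer. The forward pass places no requirement on the shape of x.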
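
To illustrate the point, here is a minimal sketch in plain Python (not PaddleSeg code) comparing the patch grid implied by an integer img_size with the grid a non-square input produces. The function names are hypothetical, and the idea that the position embedding must be resized when the two grids differ is an assumption about the mechanism, not something stated in this thread.

```python
def pos_embed_grid(img_size: int, patch_size: int) -> tuple:
    """Patch grid implied by an integer img_size (always square)."""
    side = img_size // patch_size
    return (side, side)


def input_grid(height: int, width: int, patch_size: int) -> tuple:
    """Patch grid produced by an actual input of shape (height, width)."""
    return (height // patch_size, width // patch_size)


def grids_differ(img_size: int, patch_size: int, height: int, width: int) -> bool:
    """True when the input's patch grid does not match the one built from img_size.

    Assumption: in that case the position embedding would need to be
    resized (for example by interpolation) before being added to the tokens.
    """
    return pos_embed_grid(img_size, patch_size) != input_grid(height, width, patch_size)


if __name__ == "__main__":
    print(pos_embed_grid(224, 16))        # grid built from img_size=224
    print(input_grid(512, 1024, 16))      # grid of a 512x1024 input
    print(grids_differ(224, 16, 512, 1024))
```

With img_size=224 and patch_size=16 the embedding grid is 14 x 14, while a 512 x 1024 input yields 32 x 64 patches, so the two only coincide for inputs of exactly img_size x img_size.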

wuyefeilin avatar Jun 15 '22 06:06 wuyefeilin

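You can modify the corresponding code yourself so that it accepts an array. In general, the dimensions need to be integer multiples of patch_size.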
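
As a rough sketch of the multiple-of-patch_size constraint, the helper below rounds a height and width up to the nearest multiple of patch_size, which is one way to choose a padded size instead of cropping. The function names are hypothetical and this is plain Python, not part of PaddleSeg.

```python
def round_up_to_multiple(size: int, patch_size: int) -> int:
    """Smallest multiple of patch_size that is >= size."""
    return ((size + patch_size - 1) // patch_size) * patch_size


def padded_shape(height: int, width: int, patch_size: int) -> tuple:
    """Height and width rounded up so both are multiples of patch_size."""
    return (
        round_up_to_multiple(height, patch_size),
        round_up_to_multiple(width, patch_size),
    )


if __name__ == "__main__":
    # A 500x750 image with 16x16 patches would be padded to 512x752.
    print(padded_shape(500, 750, 16))
```

Padding up keeps every pixel of the original image, at the cost of a few extra border patches.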

shiyutang avatar Jun 15 '22 07:06 shiyutang

After reading the code I mostly understand it now. Thanks!

a123b12cd avatar Jun 15 '22 10:06 a123b12cd

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Dec 09 '22 17:12 github-actions[bot]