TransUNet

Is there something wrong?

Open lvzhengyi0204 opened this issue 4 years ago • 4 comments

I was looking at your code and found a problem:

https://github.com/Beckschen/TransUNet/blob/main/networks/vit_seg_modeling.py#L133

patch_size = (img_size[0] // 16 // grid_size[0], img_size[1] // 16 // grid_size[1])

The training image size is 224.

Then patch_size = (224 // 16 // 16, 224 // 16 // 16) = (0, 0)?

Why?

lvzhengyi0204 avatar Feb 21 '21 14:02 lvzhengyi0204

Hello,

Many thanks for your question. The patch size you calculated is (1, 1) on the feature grid, which corresponds to (16, 16) at the image level, since the image is downsampled 16x by the ResNet. Let me know if you have any questions.

Beckschen avatar Feb 22 '21 05:02 Beckschen

I have the same problem when training: patch_size equals (0, 0) after the line 'patch_size = (img_size[0] // 16 // grid_size[0], img_size[1] // 16 // grid_size[1])'. I don't understand the reply from Beckschen. Could you explain a little more?

ghost avatar Mar 08 '21 03:03 ghost

I think the grid_size hyperparameter in the config should be (14, 14) instead of (16, 16), since the paper clearly states that img_size should be (224, 224).

Can you confirm this, @Beckschen?

shawnhan108 avatar Mar 16 '21 02:03 shawnhan108

I think you are right.

zjxi avatar Sep 20 '21 08:09 zjxi
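
For anyone who wants to check the arithmetic discussed in this thread, here is a minimal sketch that evaluates the quoted expression for both grid sizes. The helper names `patch_size_on_features` and `image_level_patch` are hypothetical and only used for illustration, and the factor 16 is taken from the quoted line and from the 16x downsampling mentioned in the reply. This is not the repository's code, just the same integer division worked out.

```python
# Sketch only: reproduces the integer arithmetic from the quoted line.
# Function names are hypothetical; 16 is the downsampling factor from the thread.

def patch_size_on_features(img_size, grid_size, downsample=16):
    """Evaluate img_size // downsample // grid_size per dimension."""
    return (img_size[0] // downsample // grid_size[0],
            img_size[1] // downsample // grid_size[1])


def image_level_patch(patch_size, downsample=16):
    """Map a feature-grid patch size back to image pixels."""
    return (patch_size[0] * downsample, patch_size[1] * downsample)


img_size = (224, 224)
for grid_size in [(16, 16), (14, 14)]:
    patch = patch_size_on_features(img_size, grid_size)
    print(grid_size, "->", patch, "->", image_level_patch(patch))

# (16, 16) -> (0, 0) -> (0, 0)
# (14, 14) -> (1, 1) -> (16, 16)
```

With a 224 x 224 input, the feature map after 16x downsampling is 14 x 14, so a grid of (16, 16) floors to a patch size of 0, while a grid of (14, 14) gives the (1, 1) feature-grid patch and the (16, 16) image-level patch described in the reply.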