wide-resnet.pytorch
Stride is Wrong
According to the model graph for wideresnet50 with depth 28 and widen_factor = 10, layer2.0.conv2 and layer3.0.conv2 have stride=(2, 2). It should be layer2.0.conv1 and layer3.0.conv1 that have stride=(2, 2), while layer2.0.conv2 and layer3.0.conv2 should have stride=(1, 1).
Here is the model graph:
Resnet(
(model): Wide_ResNet(
(conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(layer1): Sequential(
(0): wide_basic(
(bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(16, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(shortcut): Sequential(
(0): Conv2d(16, 160, kernel_size=(1, 1), stride=(1, 1))
)
)
(1): wide_basic(
(bn1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(shortcut): Sequential()
)
(2): wide_basic(
(bn1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(shortcut): Sequential()
)
(3): wide_basic(
(bn1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(shortcut): Sequential()
)
)
(layer2): Sequential(
(0): wide_basic(
(bn1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(160, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(320, 320, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(shortcut): Sequential(
(0): Conv2d(160, 320, kernel_size=(1, 1), stride=(2, 2))
)
)
(1): wide_basic(
(bn1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(shortcut): Sequential()
)
(2): wide_basic(
(bn1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(shortcut): Sequential()
)
(3): wide_basic(
(bn1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(shortcut): Sequential()
)
)
(layer3): Sequential(
(0): wide_basic(
(bn1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(320, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(640, 640, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(shortcut): Sequential(
(0): Conv2d(320, 640, kernel_size=(1, 1), stride=(2, 2))
)
)
(1): wide_basic(
(bn1): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(shortcut): Sequential()
)
(2): wide_basic(
(bn1): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(shortcut): Sequential()
)
(3): wide_basic(
(bn1): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU()
(conv1): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout(p=0, inplace=False)
(bn2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU()
(conv2): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(shortcut): Sequential()
)
)
(bn1): BatchNorm2d(640, eps=1e-05, momentum=0.9, affine=True, track_running_stats=True)
(relu1): ReLU()
(linear): Linear(in_features=640, out_features=100, bias=True)
)
)
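To make the difference concrete, here is a small sketch that traces the feature-map size through layer2.0 for both stride placements. It uses only the standard convolution output-size formula, not the repository code. The 32x32 input size is an assumption (CIFAR-style input, guessed from out_features=100), and the helper names `conv_out_size` and `trace_block` are made up for illustration.

```python
def conv_out_size(size, kernel, stride, padding):
    """Side length of a square conv output: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_block(size, conv1_stride, conv2_stride):
    """Side length after conv1 and after conv2 (both 3x3 with padding 1)."""
    after_conv1 = conv_out_size(size, kernel=3, stride=conv1_stride, padding=1)
    after_conv2 = conv_out_size(after_conv1, kernel=3, stride=conv2_stride, padding=1)
    return after_conv1, after_conv2


INPUT_SIZE = 32  # assumption: CIFAR-sized feature map entering layer2.0

# Placement shown in the graph above: conv1 stride 1, conv2 stride 2.
as_printed = trace_block(INPUT_SIZE, conv1_stride=1, conv2_stride=2)

# Placement proposed in this issue: conv1 stride 2, conv2 stride 1.
as_proposed = trace_block(INPUT_SIZE, conv1_stride=2, conv2_stride=1)

# The 1x1 shortcut conv (stride 2, no padding) is the same in both cases.
shortcut = conv_out_size(INPUT_SIZE, kernel=1, stride=2, padding=0)

print("as printed :", as_printed)
print("as proposed:", as_proposed)
print("shortcut   :", shortcut)
```

Under these assumptions both placements end the block at the same spatial size and match the shortcut, so the residual addition works either way. What changes is the resolution at which conv1, dropout and bn2 operate.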
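Yeah, I agree with @KaleabTessera. BTW, the current stride placement also makes the net run a little slower, because conv1 in layer2.0 and layer3.0 runs at the full input resolution before the downsampling happens.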
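A rough way to see the slowdown is to count multiply-accumulates (MACs) for the two 3x3 convs in layer2.0 under each placement. This is only a back-of-the-envelope sketch: the 32x32 input size is an assumption, the helper names are hypothetical, and actual wall-clock time depends on hardware and on the other ten blocks, which are unaffected.

```python
def conv_out_size(size, kernel, stride, padding):
    """Side length of a square conv output: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_macs(in_ch, out_ch, kernel, out_size):
    """MACs of one conv layer: one k*k*in_ch dot product per output element."""
    return in_ch * out_ch * kernel * kernel * out_size * out_size


def block_macs(in_ch, out_ch, in_size, conv1_stride, conv2_stride):
    """MACs of conv1 and conv2 in a wide_basic-style block (3x3, padding 1)."""
    mid_size = conv_out_size(in_size, 3, conv1_stride, 1)
    out_size = conv_out_size(mid_size, 3, conv2_stride, 1)
    conv1 = conv_macs(in_ch, out_ch, 3, mid_size)
    conv2 = conv_macs(out_ch, out_ch, 3, out_size)
    return conv1, conv2


# layer2.0 from the graph: 160 -> 320 channels; 32x32 input is an assumption.
printed_conv1, printed_conv2 = block_macs(160, 320, 32, conv1_stride=1, conv2_stride=2)
proposed_conv1, proposed_conv2 = block_macs(160, 320, 32, conv1_stride=2, conv2_stride=1)

printed_total = printed_conv1 + printed_conv2
proposed_total = proposed_conv1 + proposed_conv2

print("conv1 MACs, as printed vs proposed:", printed_conv1, proposed_conv1)
print("block MACs, as printed vs proposed:", printed_total, proposed_total)
print("block ratio:", printed_total / proposed_total)
```

With these numbers conv2 costs the same in both cases, while conv1 does four times the work when the stride sits on conv2, so the two convs of this one block together cost about twice as much. Only layer2.0 and layer3.0 are affected, which fits the "a little slower" observation.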