sagan-pytorch
Why so many ConvBlock(512, 512, ...) layers?
self.conv = nn.ModuleList([ConvBlock(512, 512, n_class=n_class),
                           ConvBlock(512, 512, n_class=n_class),
                           ConvBlock(512, 512, n_class=n_class,
                                     self_attention=True),
                           ConvBlock(512, 256, n_class=n_class),
                           ConvBlock(256, 128, n_class=n_class)])
A deeper and wider network gave better results. Since each ConvBlock contains only one conv module, the network is not very deep.
It's even better to use more ConvBlocks with fewer filters each. Deeper nets tend to learn better features. I would say the current model architecture is too wide, which leads to many "dead" kernels.
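To make the width-versus-depth trade-off concrete, here is a rough sketch that compares conv weight counts for the channel layout in the question against a deeper, narrower layout. It assumes 3x3 kernels with a bias term, which may not match the repo's ConvBlock, and it ignores the normalization and self-attention parameters. The `narrow_deep` layout and the helper names are made up for illustration, not a tested recommendation.

```python
# Assumption: each ConvBlock holds one 3x3 conv with bias. The real block
# may differ, so treat these numbers as rough estimates only.
def conv_params(in_ch, out_ch, kernel_size=3, bias=True):
    """Number of learnable parameters in a single 2D convolution."""
    weights = in_ch * out_ch * kernel_size * kernel_size
    return weights + (out_ch if bias else 0)


def stack_params(channels):
    """Total conv parameters for a list of (in_ch, out_ch) pairs."""
    return sum(conv_params(i, o) for i, o in channels)


# Channel layout taken from the snippet in the question (5 blocks).
wide = [(512, 512), (512, 512), (512, 512), (512, 256), (256, 128)]

# Hypothetical alternative: more blocks, fewer filters per block (7 blocks).
narrow_deep = [(512, 256)] + [(256, 256)] * 5 + [(256, 128)]

wide_total = stack_params(wide)
narrow_total = stack_params(narrow_deep)

print(f"wide:        {len(wide)} blocks, {wide_total:,} conv params")
print(f"narrow_deep: {len(narrow_deep)} blocks, {narrow_total:,} conv params")
```

Under these assumptions the narrower stack has two more blocks but roughly half the conv parameters. Whether it actually trains better would have to be checked experimentally.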