DenseNet
Why is the composite function BN-ReLU-Conv3x3?
Hello,
The composite function of other models is Conv3x3-BN-ReLU. Why is DenseNet special?
Looking forward to your answer. Thanks
Hi. This follows the pre-activation design from the second ResNet paper: https://arxiv.org/abs/1603.05027
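The essential difference is that each BN-ReLU-Conv3x3 has its own scaling parameters in its BN layer, so every layer rescales the concatenated features it receives before using them. If BN were instead applied once on the producing side (as in Conv3x3-BN-ReLU), every subsequent layer would receive features normalized with the same BN scaling parameters.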
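
To make the difference concrete, here is a minimal pure-Python sketch, not the actual DenseNet code. It takes one feature channel that two later layers both reuse and compares what each layer sees under the two orderings. The names `layer_a` and `layer_b`, the input values, and the (gamma, beta) pairs are all made up for illustration, and the convolution itself is left out since only the BN placement matters here.

```python
import math

def batch_norm(xs, gamma, beta, eps=1e-5):
    """Normalize a 1-D batch to zero mean and unit variance, then scale and shift."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in xs]

def relu(xs):
    return [max(0.0, x) for x in xs]

# One channel of raw output from an earlier layer. In a dense block this
# channel is concatenated into the input of every later layer.
shared_feature = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Hypothetical learned BN parameters (gamma, beta) of two later layers.
bn_params = {"layer_a": (1.0, 0.0), "layer_b": (1.0, 1.0)}

# BN-ReLU-Conv (pre-activation): each consuming layer applies its OWN BN
# and ReLU to the shared feature before its convolution.
pre_act_inputs = {
    name: relu(batch_norm(shared_feature, gamma, beta))
    for name, (gamma, beta) in bn_params.items()
}

# Conv-BN-ReLU: the producing layer applies a single BN and ReLU once,
# and every later layer reads that same result.
producer_gamma, producer_beta = 1.0, 0.0  # hypothetical values
produced = relu(batch_norm(shared_feature, producer_gamma, producer_beta))
post_act_inputs = {name: produced for name in bn_params}

for name in bn_params:
    print(name, "pre-activation input: ", pre_act_inputs[name])
    print(name, "post-activation input:", post_act_inputs[name])
```

With pre-activation, the two layers end up with different views of the same feature, because the shift in each BN changes which values survive the ReLU. With Conv-BN-ReLU, both layers are handed an identical tensor.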