horserma
@LaoAnchor In the original residual network paper, they use a batch normalization layer followed by a scale layer. Unlike in the original "batch normalization" paper, their norm layer only subtracts...
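To make the two-layer split concrete, here is a rough sketch in plain Python. It assumes the norm layer only standardizes the activations with batch statistics and has no learnable parameters, while a separate scale layer applies the learnable scale and shift (gamma and beta in the batch normalization paper). The comment above is cut off, so that reading is an assumption, and the function names `normalize_only`, `scale_layer` and `norm_then_scale` are made up for illustration; they are not any framework's API.

```python
import math


def normalize_only(batch, eps=1e-5):
    """Hypothetical norm layer: standardize one feature over the batch.

    Assumption: no learnable parameters here, only batch statistics.
    """
    n = len(batch)
    mean = sum(batch) / n
    # Biased (divide-by-n) variance, as used for normalization in the BN paper.
    var = sum((x - mean) ** 2 for x in batch) / n
    return [(x - mean) / math.sqrt(var + eps) for x in batch]


def scale_layer(batch, gamma=1.0, beta=0.0):
    """Hypothetical scale layer: learnable per-feature scale and shift."""
    return [gamma * x + beta for x in batch]


def norm_then_scale(batch, gamma, beta, eps=1e-5):
    """Chaining the two layers gives y = gamma * x_hat + beta."""
    return scale_layer(normalize_only(batch, eps), gamma, beta)


if __name__ == "__main__":
    activations = [1.0, 2.0, 3.0, 4.0]
    print(normalize_only(activations))
    print(norm_then_scale(activations, gamma=2.0, beta=0.5))
```

With the default `gamma=1.0` and `beta=0.0` the scale layer is an identity, so under this assumption the pair of layers computes the same function as a single layer that does both steps.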