horserma

Results: 1 comment by horserma

@LaoAnchor In the original residual network paper, the authors use a batch normalization layer followed by a scale layer. Unlike in the original "batch normalization" paper, their norm layer only subtracts...
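
The comment above is cut off, so the following is only a rough sketch of the split it seems to describe, under a stated assumption: the norm layer standardizes the batch (subtracts the mean and divides by the standard deviation) with no learned parameters, and a separate scale layer then applies the learned scale and shift. The names `normalize_only`, `scale_layer`, `gamma`, `beta`, and `eps` are hypothetical and are not taken from the comment or from any particular framework.

```python
import math

def normalize_only(batch, eps=1e-5):
    """Assumed behavior of the norm layer: standardize, no learned parameters."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((v - mean) ** 2 for v in batch) / n  # biased (population) variance
    return [(v - mean) / math.sqrt(var + eps) for v in batch]

def scale_layer(normalized, gamma, beta):
    """Assumed behavior of the scale layer: learned per-feature scale and shift."""
    return [gamma * v + beta for v in normalized]

# One feature across a batch of four samples (illustrative values only).
batch = [1.0, 2.0, 3.0, 4.0]
x_hat = normalize_only(batch)
y = scale_layer(x_hat, gamma=2.0, beta=0.5)
```

Chaining the two functions gives the same result as a single batch normalization layer that carries its own scale and shift, which is why the two-layer arrangement can stand in for it.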