benchmarks
Weight decay of biases
Hi, is it intended that weight decay is applied to biases? That appears to be what the code does, and applying weight decay to biases seems to decrease model performance. Will this be fixed?
I do not think this was done intentionally, because the official models do not apply weight decay to biases. Before fixing it, we need to make sure the change does not affect resnet50 convergence. /CC @bignamehyp to do this.
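
For reference, here is a minimal sketch of what excluding biases from weight decay could look like. It is plain Python, not the benchmark code: the function name `l2_penalty`, the example variable names, and the rule that a variable counts as a bias when the last component of its name contains "bias" are all assumptions made for illustration.

```python
def l2_penalty(params, weight_decay, skip_biases=True):
    """Return weight_decay * sum(0.5 * ||v||^2) over the selected variables.

    params: dict mapping a variable name to a flat list of floats.
    skip_biases: if True, variables whose last name component contains
        "bias" are left out of the penalty (hypothetical naming rule).
    """
    total = 0.0
    for name, values in params.items():
        is_bias = "bias" in name.split("/")[-1]
        if skip_biases and is_bias:
            continue  # biases contribute nothing to the regularization term
        total += 0.5 * sum(v * v for v in values)
    return weight_decay * total


# Hypothetical variables: one kernel and one bias.
params = {
    "conv1/kernel": [1.0, -2.0],
    "conv1/bias": [3.0],
}

penalty_all = l2_penalty(params, 0.1, skip_biases=False)  # kernel and bias
penalty_no_bias = l2_penalty(params, 0.1, skip_biases=True)  # kernel only
```

With these values the bias accounts for most of the penalty when it is included, which shows how much the regularization term can change once biases are filtered out. Whether that helps or hurts a given model still has to be checked empirically, as noted above for resnet50.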