
BatchNorm beta diminishing parameter values.

Open aksrajvanshi opened this issue 4 years ago • 3 comments

@stu1130

I am running the BatchNorm section, and I defined gamma and beta as the two parameters in that section. Although the results turn out fine, beta ends up with very low values.

This is the book's result:

[Screenshot: 2020-07-14, 2:27:25 PM]

And this is mine:

[Screenshot: 2020-07-14, 2:26:54 PM]

aksrajvanshi, Jul 14 '20 18:07
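
For readers following along, the role of the two parameters mentioned above can be sketched in a few lines. This is a minimal, dependency-free illustration of the standard training-mode BatchNorm transform for a single feature, not the DJL or book implementation; the function name `batch_norm_forward` and the sample values are hypothetical.

```python
import math

def batch_norm_forward(xs, gamma, beta, eps=1e-5):
    """Training-mode batch norm over a batch of scalars (a single feature).

    Hypothetical helper for illustration only; not the DJL API.
    """
    n = len(xs)
    mean = sum(xs) / n
    # Biased (population) variance of the batch, used for normalization.
    var = sum((x - mean) ** 2 for x in xs) / n
    # Normalize to roughly zero mean and unit variance.
    x_hat = [(x - mean) / math.sqrt(var + eps) for x in xs]
    # gamma rescales the normalized values, beta shifts them.
    return [gamma * xh + beta for xh in x_hat]

batch = [1.0, 2.0, 3.0, 4.0]  # made-up input values
out = batch_norm_forward(batch, gamma=1.0, beta=0.5)
out_mean = sum(out) / len(out)
out_std = math.sqrt(sum((y - out_mean) ** 2 for y in out) / len(out))
```

In this sketch the mean of the output is set by beta and its spread by gamma, so a beta that stays close to zero simply means the layer's output stays centered near zero.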

I can reproduce the issue and found that both PyTorch and MXNet have the same problem. This could be caused by vanishing gradients. Only the first BatchNorm has the issue. Need to dive deeper.

stu1130, Jul 17 '20 17:07
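
To make the vanishing-gradient hypothesis above concrete: for y = gamma * x_hat + beta, the gradient with respect to beta is just the sum of the upstream gradients, so if those are tiny, beta barely moves from its initial value. The sketch below only illustrates that relationship. The helper names and every gradient value are made up for illustration; nothing here was measured from the model in this issue, and it does not confirm the cause.

```python
def batch_norm_param_grads(x_hat, upstream):
    """Gradients w.r.t. gamma and beta for y = gamma * x_hat + beta.

    Hypothetical helper; `upstream` holds dL/dy for each batch element.
    """
    d_gamma = sum(g * xh for g, xh in zip(upstream, x_hat))
    d_beta = sum(upstream)  # beta is a pure shift, so its gradient is the plain sum
    return d_gamma, d_beta

def sgd_step(param, grad, lr):
    """One plain SGD update."""
    return param - lr * grad

x_hat = [-1.5, -0.5, 0.5, 1.5]           # made-up normalized activations
healthy = [0.2, -0.1, 0.3, 0.1]          # made-up upstream gradients
vanished = [g * 1e-6 for g in healthy]   # same gradients after severe shrinkage

_, d_beta_healthy = batch_norm_param_grads(x_hat, healthy)
_, d_beta_vanished = batch_norm_param_grads(x_hat, vanished)

# Start beta at 0.0 (a common initial value) and take one step each.
beta_healthy = sgd_step(0.0, d_beta_healthy, lr=0.1)
beta_vanished = sgd_step(0.0, d_beta_vanished, lr=0.1)
```

Comparing `beta_healthy` and `beta_vanished` shows how much smaller the update becomes when the upstream gradients shrink. Logging the gradient of beta for the first BatchNorm layer during training would be one way to check whether this is what happens here.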

Reproducible using the general batchnorm implementation. Investigating now.

lanking520, Sep 17 '20 18:09

@lanking520 do we have any update?

goswamig, Mar 15 '21 21:03