DeepRec
Masknet replace batchnorm with layernorm
The MaskNet paper uses LayerNorm; however, the code implementation uses BatchNorm.
Hi Marc,
Thanks for bringing this up! This is indeed a bug, and we are fixing it.
Hi Marc,
Upon checking, this is not a bug. When BatchNorm is applied on the default axis (the last dimension), it reduces to LayerNorm, and since the size of gamma/beta depends on the shape of the input tensor, the original implementation is still correct.
However, for code clarity, we updated the example (ref PR #816).
Thanks for the comment!
I am not sure I am following; see this screenshot.
What am I missing?
Because your code isn't running in training mode.
tf.layers.batch_normalization() calls into the class BatchNormalizationBase:
https://github.com/DeepRec-AI/DeepRec/blob/6bd822e4d05c6b2a005e58342c7661c387b417cb/tensorflow/python/keras/layers/normalization.py#L43
tf.keras.layers.LayerNormalization() calls into the class LayerNormalization:
https://github.com/DeepRec-AI/DeepRec/blob/6bd822e4d05c6b2a005e58342c7661c387b417cb/tensorflow/python/keras/layers/normalization.py#L898
In LayerNormalization, the mean and variance are computed by nn.moments:
https://github.com/DeepRec-AI/DeepRec/blob/6bd822e4d05c6b2a005e58342c7661c387b417cb/tensorflow/python/keras/layers/normalization.py#L1025
It then uses nn.batch_normalization to get the result:
https://github.com/DeepRec-AI/DeepRec/blob/6bd822e4d05c6b2a005e58342c7661c387b417cb/tensorflow/python/keras/layers/normalization.py#L1040-L1046
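The two steps described for LayerNormalization (compute moments, then apply the normalization formula) can be sketched in plain Python. This is only an illustration of the arithmetic, not the TensorFlow code: the helper names moments, normalize and layer_norm are hypothetical, and the epsilon value of 1e-3 is assumed for the example.

```python
import math

def moments(row):
    """Mean and population variance of one sample (its last axis)."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return mean, var

def normalize(row, mean, var, gamma=1.0, beta=0.0, eps=1e-3):
    """Apply gamma * (x - mean) / sqrt(var + eps) + beta to each element."""
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in row]

def layer_norm(batch, eps=1e-3):
    """Normalize each sample with statistics computed from that sample only."""
    out = []
    for row in batch:
        mean, var = moments(row)
        out.append(normalize(row, mean, var, eps=eps))
    return out

# Two samples with three features each (made-up values).
batch = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
result = layer_norm(batch)
print(result)
```

Because the statistics come from the input itself, this computation does not depend on any stored state and behaves the same whether or not the model is in training mode.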
This is the same as what BN does, leaving aside BN's other features:
https://github.com/DeepRec-AI/DeepRec/blob/6bd822e4d05c6b2a005e58342c7661c387b417cb/tensorflow/python/keras/layers/normalization.py#L643-L652
https://github.com/DeepRec-AI/DeepRec/blob/6bd822e4d05c6b2a005e58342c7661c387b417cb/tensorflow/python/keras/layers/normalization.py#L736-L739
https://github.com/DeepRec-AI/DeepRec/blob/6bd822e4d05c6b2a005e58342c7661c387b417cb/tensorflow/python/keras/layers/normalization.py#L820-L825
The difference is that when you are not in training, BN replaces the computed mean and variance with its stored moving mean and moving variance:
https://github.com/DeepRec-AI/DeepRec/blob/6bd822e4d05c6b2a005e58342c7661c387b417cb/tensorflow/python/keras/layers/normalization.py#L744-L750
You can pass the input parameter moving_mean_initializer='ones' (it defaults to 'zeros') and you will find that the output changes.
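To see why the initializer matters outside training, here is a minimal plain-Python sketch of the inference-mode arithmetic. It is not the TensorFlow implementation: batch_norm_inference is a hypothetical helper, the input values are made up, and it assumes a freshly initialized layer with a moving variance of ones and an epsilon of 1e-3.

```python
import math

def batch_norm_inference(batch, moving_mean, moving_var,
                         gamma=1.0, beta=0.0, eps=1e-3):
    """Normalize each feature with stored moving statistics, ignoring the batch."""
    return [
        [gamma * (v - moving_mean[j]) / math.sqrt(moving_var[j] + eps) + beta
         for j, v in enumerate(row)]
        for row in batch
    ]

batch = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
n_features = 3

# Moving mean initialized to zeros: the input passes through almost unchanged.
out_zeros = batch_norm_inference(batch, [0.0] * n_features, [1.0] * n_features)

# Moving mean initialized to ones: every value is shifted by 1 before scaling.
out_ones = batch_norm_inference(batch, [1.0] * n_features, [1.0] * n_features)

print(out_zeros[0])
print(out_ones[0])
```

With untrained moving statistics, the output depends only on the initial values, which is why swapping the initializer changes the result even though the input is identical.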
Thanks @Duyi-Wang, that makes sense. I was confused by it as well, but the docs clearly state it. Thanks for pointing out the code.
Adding a screenshot for posterity.
Feel free to close this one.