BatchNormalization gives incorrect output with masked inputs > 3 dimensions
The mean/variance calculations are incorrect, which means the inputs are not normalized correctly. E.g.
```python
import keras

x = keras.ops.ones((1, 2, 3, 4))
x._keras_mask = keras.ops.ones((1, 2, 1))
y = keras.layers.BatchNormalization()(x, training=True)
print(keras.ops.mean(y, axis=-1))
```
gives the output

```
tf.Tensor([-0.57732624 -0.57732624 -0.57732624 -0.57732624], shape=(4,), dtype=float32)
```

instead of the correct normalized output `[0, 0, 0, 0]`.
The basic issue is that this calculation is incorrect: https://github.com/keras-team/keras/blob/efaaf85e19113400f23462cbafcef433cd95ad9c/keras/src/layers/normalization/batch_normalization.py#L310-L314. It doesn't account for the mask being broadcast against the input, so it computes an element count of 2 in the example above, when it should be 2 * 3 * 4.
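A minimal numpy sketch of the intended fix, assuming the same semantics as the example above (the `masked_mean` helper is hypothetical, not the actual Keras code): the mask is broadcast to the full input shape before counting elements, so the denominator reflects every element the mask covers.

```python
import numpy as np

def masked_mean(x, mask, reduction_axes):
    # Align the mask with x's *leading* dimensions by appending
    # trailing axes, e.g. (1, 2, 1) -> (1, 2, 1, 1).
    while mask.ndim < x.ndim:
        mask = mask[..., None]
    # Broadcast to x's full shape so the element count is correct:
    # the mask now covers 2 * 3 * 4 elements, not 2.
    mask = np.broadcast_to(mask, x.shape).astype(x.dtype)
    total = (x * mask).sum(axis=reduction_axes, keepdims=True)
    count = mask.sum(axis=reduction_axes, keepdims=True)
    return total / count

x = np.ones((1, 2, 3, 4))
mask = np.ones((1, 2, 1))
# Reduce over all axes except the channel axis (-1), as BatchNormalization does.
m = masked_mean(x, mask, reduction_axes=(0, 1, 2))
print(m.reshape(-1))  # -> [1. 1. 1. 1.]
```

With a correct per-channel mean of 1 for this all-ones input, the normalized output would then be zero-mean, matching the expected `[0, 0, 0, 0]`.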
See https://github.com/keras-team/keras/issues/19818 for more discussion/background.
I think a better workaround is to validate the shape of the mask in keras.
The shape of the mask is correct in this example (according to https://github.com/keras-team/keras/issues/19818#issuecomment-2156142266), so validation wouldn't help in this case.
> because it doesn't account for the broadcasting (i.e. it gives a value of 2 in the above example, when it should be 2 * 3 * 4).
Broadcasting from (2,) to (2, 3, 4) makes sense here, but elsewhere "broadcasting" usually starts from the rightmost dimension, i.e. it would broadcast (4,) to (2, 3, 4).
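The distinction can be illustrated with plain numpy: right-aligned (NumPy-style) broadcasting accepts (4,) against (2, 3, 4) but rejects (2,), whereas the mask semantics here are left-aligned, padding trailing axes instead.

```python
import numpy as np

x = np.ones((2, 3, 4))

# Right-aligned broadcasting: (4,) lines up with the last axis.
ok = np.broadcast_to(np.ones(4), x.shape)
print(ok.shape)  # (2, 3, 4)

# (2,) does NOT right-align with (2, 3, 4).
try:
    np.broadcast_to(np.ones(2), x.shape)
except ValueError:
    print("right-aligned broadcast of (2,) fails")

# Left-aligned semantics: pad trailing axes, then broadcast.
left_aligned = np.broadcast_to(np.ones(2)[:, None, None], x.shape)
print(left_aligned.shape)  # (2, 3, 4)
```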