big_vision
Behavior of `solarize()` depends on integer overflow
I am not 100% sure about the intention, but I do want to raise the alarm. The `solarize()` transform here
https://github.com/google-research/big_vision/blob/01edb81a4716f93a48be43b3a4af14e29cdb3a7f/big_vision/pp/autoaugment.py#L180-L184
inverts a pixel when its value is greater than or equal to the threshold, so one would think that a higher augmentation magnitude needs a lower threshold. However, the threshold increases linearly with magnitude:
https://github.com/google-research/big_vision/blob/01edb81a4716f93a48be43b3a4af14e29cdb3a7f/big_vision/pp/autoaugment.py#L513
Counterintuitively, it still works as expected with `magnitude=_MAX_LEVEL` because of integer overflow. Given `t = tf.constant([[[0,0,0]]], dtype=tf.uint8)`, `t < i` evaluates to `tf.Tensor([[[False False False]]], shape=(1, 1, 3), dtype=bool)` iff `not (i % 256)`. In other words, `magnitude=_MAX_LEVEL` means `int((level/_MAX_LEVEL) * 256) = 256`, which is equivalent to `0` in `tf.uint8`.
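The wraparound can be illustrated without TensorFlow. The sketch below emulates it in plain Python, assuming (as described above) that the integer threshold wraps modulo 256 when it is compared against a `uint8` tensor. The names `to_uint8` and `pixel_is_kept` are hypothetical and exist only for this illustration.

```python
def to_uint8(value):
    """Emulate wrapping a non-negative int into 8 unsigned bits (assumed behavior)."""
    return value % 256


def pixel_is_kept(pixel, threshold):
    """Mirror the `t < i` comparison after the threshold has wrapped."""
    return pixel < to_uint8(threshold)


# A zero pixel fails the comparison exactly when the threshold is a multiple of 256.
for threshold in (0, 1, 230, 255, 256, 512):
    print(threshold, to_uint8(threshold), pixel_is_kept(0, threshold))
```

In this emulation a threshold of 256 behaves exactly like a threshold of 0, which matches the `not (i % 256)` condition stated above.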
Given the following `tf_gradient` that goes from `(0, 0, 0)` to `(255, 255, 255)` in alternating directions, both `solarize(tf_gradient, 256)` and `solarize(tf_gradient, 0)` indeed fully invert the image.
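The full inversion can be checked numerically with a small pure-Python sketch. It is not the library's implementation: `solarize` here is a re-implementation of the rule described in this issue (invert a pixel when its value is greater than or equal to the threshold), the `% 256` step is the assumed `uint8` wraparound, and `gradient` is a hypothetical 1-D stand-in for `tf_gradient`.

```python
def solarize(pixels, threshold):
    """Keep pixels below the wrapped threshold; invert the rest as 255 - value."""
    wrapped = threshold % 256  # assumed uint8 wraparound of the threshold
    return [p if p < wrapped else 255 - p for p in pixels]


# Hypothetical 1-D stand-in for tf_gradient: every uint8 value exactly once.
gradient = list(range(256))


def inverted_count(threshold):
    """Count how many gradient pixels are changed by solarize()."""
    return sum(1 for before, after in zip(gradient, solarize(gradient, threshold))
               if before != after)


print(inverted_count(0))    # 256: every pixel is inverted
print(inverted_count(256))  # 256: the threshold wraps to 0, so the result is the same
```

The same helper can be called with any other threshold to see how many of the 256 gradient values end up inverted.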
But if magnitude is `9`, `int((9/10) * 256) = 230`, and `solarize(tf_gradient, 230)` "abruptly" inverts only a small portion of the image.