
Behavior of `solarize()` depends on integer overflow

Open • EIFY opened this issue 8 months ago • 0 comments

I am not 100% sure about the intention, but I do want to raise the alarm. The `solarize()` transform here

https://github.com/google-research/big_vision/blob/01edb81a4716f93a48be43b3a4af14e29cdb3a7f/big_vision/pp/autoaugment.py#L180-L184

inverts a pixel when its value is greater than or equal to the threshold, so one would expect a higher augmentation magnitude to need a lower threshold. However, the threshold increases linearly with magnitude:

https://github.com/google-research/big_vision/blob/01edb81a4716f93a48be43b3a4af14e29cdb3a7f/big_vision/pp/autoaugment.py#L513
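To make the direction of the effect concrete, here is a minimal pure-Python sketch of the behavior described above. It is not the big_vision code: `solarize_pixel`, `threshold_for`, and `inverted_count` are illustrative names, and `MAX_LEVEL = 10` is an assumption taken from the `9/10` arithmetic used later in this issue. Integer overflow is deliberately ignored here.

```python
MAX_LEVEL = 10  # assumption: inferred from the 9/10 arithmetic in this issue


def solarize_pixel(value, threshold):
    """Invert an 8-bit pixel value when it is >= threshold, else keep it."""
    return 255 - value if value >= threshold else value


def threshold_for(level):
    """Threshold as a linear function of magnitude, as described above."""
    return int((level / MAX_LEVEL) * 256)


def inverted_count(threshold):
    """How many of the 256 possible pixel values would be inverted."""
    return sum(1 for v in range(256) if v >= threshold)


for level in (1, 5, 9):
    thr = threshold_for(level)
    print(f"magnitude={level} threshold={thr} inverted={inverted_count(thr)}/256")
```

Under these assumptions, a larger magnitude gives a larger threshold and therefore fewer inverted pixel values, which is the opposite of what one would expect from a stronger augmentation.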

Counterintuitively, it still works as expected with `magnitude=_MAX_LEVEL` because of integer overflow. Given

`t = tf.constant([[[0,0,0]]], dtype=tf.uint8)`

`t < i` evaluates to `tf.Tensor([[[False False False]]], shape=(1, 1, 3), dtype=bool)` iff `not (i % 256)`, that is, iff `i` is a multiple of 256. In other words, `magnitude=_MAX_LEVEL` means `int((level/_MAX_LEVEL) * 256) = 256`, which is equivalent to 0 in `tf.uint8`, so every pixel value counts as greater than or equal to the threshold.

Given the following `tf_gradient`, which goes from (0, 0, 0) to (255, 255, 255) in alternating directions:

[Image: `tf_gradient` test image]

Both solarize(tf_gradient, 256) and solarize(tf_gradient, 0) indeed fully invert the image:

[Image: fully inverted output of `solarize(tf_gradient, 256)` and `solarize(tf_gradient, 0)`]
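The equivalence of thresholds 256 and 0 can be reproduced without TensorFlow by emulating the 8-bit wrap-around. This is only a sketch: `to_uint8` and `solarize_wrapped` are hypothetical helpers, the gradient is a flat list rather than an image tensor, and the assumption is that the threshold ends up wrapped modulo 256 as described above. How TensorFlow actually converts an out-of-range Python integer may depend on the version.

```python
def to_uint8(value):
    """Wrap a Python int the way an unsigned 8-bit integer would (modulo 256)."""
    return value & 0xFF


def solarize_wrapped(pixels, threshold):
    """Invert pixels >= threshold, with the threshold wrapped to 8 bits first."""
    thr = to_uint8(threshold)  # assumption: mimics the uint8 overflow in the issue
    return [255 - p if p >= thr else p for p in pixels]


gradient = list(range(256))  # stand-in for one row of a 0..255 gradient

full_inversion = [255 - p for p in gradient]
print(to_uint8(256))                                      # 0
print(solarize_wrapped(gradient, 256) == full_inversion)  # True
print(solarize_wrapped(gradient, 0) == full_inversion)    # True
```

Without the wrap, a threshold of 256 would be larger than every 8-bit pixel value and nothing would be inverted, so the "maximum magnitude" behavior depends entirely on the overflow.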

But if magnitude is 9, int((9/10) * 256) = 230, and solarize(tf_gradient, 230) "abruptly" only inverts a small portion of the image:

[Image: output of `solarize(tf_gradient, 230)`, with only a small portion inverted]
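The jump between magnitude 9 and magnitude 10 can be tabulated with a short sweep. Again, this is a sketch under stated assumptions rather than the library's implementation: `MAX_LEVEL = 10` is inferred from the arithmetic above, `inverted_fraction` is a hypothetical helper, and the `% 256` stands in for the `tf.uint8` overflow.

```python
MAX_LEVEL = 10  # assumption: inferred from the 9/10 arithmetic in this issue


def inverted_fraction(level):
    """Fraction of the 256 pixel values inverted at a given magnitude."""
    threshold = int((level / MAX_LEVEL) * 256) % 256  # emulate uint8 wrap
    inverted = sum(1 for v in range(256) if v >= threshold)
    return inverted / 256


for level in range(1, MAX_LEVEL + 1):
    print(f"magnitude={level:2d}  inverted={inverted_fraction(level):.1%}")
```

Under these assumptions the inverted fraction shrinks steadily from magnitude 1 to magnitude 9, where only 26 of 256 values (about 10%) are inverted, and then jumps to 100% at magnitude 10.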

EIFY • Jun 06 '24 04:06