Channel shift transformation commits overflow if images are uint8
Greetings,
An unsuspecting user can easily commit overflow when using ImageDataGenerator with channel_shift_range if he uses uint8 images, as is typical the case.
You can easily see that is the case just looking at the code:
From affine_transformation.py,
np.clip(x_channel + intensity,
min_x,
max_x)
I think at the very least the documentation should say something about this. The documentation on this transformation should be more descriptive anyway. Something like:
This transformation adds a different random constant to each channel of the image. Beware of overflows when using
dtype=uint8.
Some of the other transformations change the type of the input. Personally, I think the more consistent solution would be to just do that.
Reproducing the bug
from keras import datasets
from keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
(X, _), _ = datasets.mnist.load_data()
x = X[0, :, :, None]
t = {'channel_shift_intensity': 150}
gen = ImageDataGenerator()
y = gen.apply_transform(x, t)
z = gen.apply_transform(x.astype(int), t)
plt.subplot(1, 3, 1)
plt.imshow(x[:, :, 0], cmap='gray')
plt.title('Original')
plt.subplot(1, 3, 2)
plt.imshow(y[:, :, 0], cmap='gray')
plt.title('Channel Shift w/uint8')
plt.subplot(1, 3, 3)
plt.imshow(z[:, :, 0], cmap='gray')
plt.title('Channel Shift w/int32')
plt.show()

Clipping values to avoid overflow comes at some costs. See this SO answer. So this is design decision to optimize for speed/memory or convenience. As this is a python library I would vote for some additional check to avoid the overflow.
No need to add code complexity. Just write in the documentation "avoid using uint8 data when using channel_shift_range". Maybe also do a warning when the user uses uint8 data. Something to avoid the unsuspecting user from shooting himself in the foot.
the unsuspecting user won't read the documentation first.
I cannot speak of other people, but saying that in the document would be enough for me. But add a warning then. Or just convert the type when it doesn't make sense. That's what packages like scikit-image do. I would not make the code slower just because some people might use uint8.