keras-cv
Added CLAHE augmentation layer
Resolves #359.
Based on the implementation by @isears, this PR is ready for review @LukeWood!
A huge thanks to @isears!
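For reviewers who haven't used the technique before, here is a rough NumPy sketch of the idea behind CLAHE (clip-limited, per-tile histogram equalization). It is only an illustration of the algorithm, not the implementation in this PR: the tile_size and clip_limit defaults are made up, and it skips the bilinear blending between neighbouring tiles that full CLAHE uses to avoid block artifacts.

import numpy as np


def clahe_sketch(image, tile_size=64, clip_limit=40.0):
    # image: 2-D uint8 array (single channel); returns an equalized uint8 array.
    out = np.zeros_like(image)
    h, w = image.shape
    for y in range(0, h, tile_size):
        for x in range(0, w, tile_size):
            tile = image[y:y + tile_size, x:x + tile_size]
            # Clip the tile histogram at clip_limit and redistribute the
            # clipped excess uniformly over all 256 bins.
            hist = np.bincount(tile.ravel(), minlength=256).astype(np.float64)
            excess = np.maximum(hist - clip_limit, 0.0).sum()
            hist = np.minimum(hist, clip_limit) + excess / 256.0
            # Equalize the tile with the CDF of the clipped histogram.
            cdf = hist.cumsum()
            cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1e-8) * 255.0
            out[y:y + tile_size, x:x + tile_size] = cdf[tile].astype(np.uint8)
    return out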
Sample augmentation on the oxford_flowers102 dataset:
On this Chest X-Ray dataset:
Seems to run out of memory when running on oxford_flowers102. I didn't get time to investigate it completely, but you might want to look into it. This script may help you reproduce it:
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_datasets as tfds

from keras_cv.layers import preprocessing

IMG_SIZE = (224, 224)
BATCH_SIZE = 64


def resize(image, label, num_classes=10):
    image = tf.image.resize(image, IMG_SIZE)
    label = tf.one_hot(label, num_classes)
    return image, label


def main():
    data, ds_info = tfds.load("oxford_flowers102", with_info=True, as_supervised=True)
    train_ds = data["train"]
    num_classes = ds_info.features["label"].num_classes
    train_ds = (
        train_ds.map(lambda x, y: resize(x, y, num_classes=num_classes))
        .shuffle(10 * BATCH_SIZE)
        .batch(BATCH_SIZE)
    )
    clahe = preprocessing.CLAHE([0, 255])
    train_ds = train_ds.map(
        lambda x, y: (clahe(x), y),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    for images, labels in train_ds.take(1):
        plt.figure(figsize=(8, 8))
        for i in range(9):
            plt.subplot(3, 3, i + 1)
            plt.imshow(images[i].numpy().astype("uint8"))
            plt.axis("off")
        plt.show()


if __name__ == "__main__":
    main()
This is the standard template for demo files.
@adhadse Thank you for creating this PR! Great to see a histogram equalization scheme used in imaging in the TensorFlow ecosystem.
Can we provide some sample images in the PR?
@quantumalaviya I'll look into the issue. I also saw warnings about this memory issue on my 12 GB system during testing. @LukeWood I'll add a Colab notebook soon, addressing the above issue and including a sample augmentation image.
@quantumalaviya I think this is because the computation is really expensive, even with interpolation. Just changing BATCH_SIZE to a smaller value (I changed it to 32) makes this work fine. Let me know if there are other issues.
Link to the Colab notebook (might be deleted in the future).
@LukeWood I have now added a sample illustration image for CLAHE.
@quantumalaviya ... Just changing BATCH_SIZE to a smaller value (I changed it to 32) makes this work fine.
I think it would be impractical to use CLAHE in general if it only works at a limited batch size. Optimization is needed here. Curious, do the other implementations (scikit-image, OpenCV) also run out of memory?
cc @adhadse @isears
Other than the memory issue, and as it could be related: do we have the same implementation issues here as in https://github.com/tensorflow/addons/pull/2362#issuecomment-767136266?
@bhack @innat Since the implementation is derived from the same PR, the same issue follows here. I think the memory requirements of the implementations across OpenCV, scikit-image, and this PR are all high.
See the benchmark notebook.
I wasn't able to build TFA from source to memory-benchmark @isears's CLAHE PR, but in any case the results showed high memory requirements.
Along with that, the runtime performance of the implementation in this PR is the worst of the three and needs to be improved, which @bhack already discussed previously as a consequence of the implementation approach.
Library | %memit output | %timeit output
---|---|---
OpenCV | peak memory: 1140.96 MiB, increment: 0.00 MiB | 1000 loops, best of 5: 324 µs per loop
scikit-image | peak memory: 1140.96 MiB, increment: 0.00 MiB | 10 loops, best of 5: 25.6 ms per loop
This PR | peak memory: 1141.30 MiB, increment: 0.34 MiB | 1 loop, best of 5: 358 ms per loop
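For anyone who wants to sanity-check numbers like these, a rough single-image benchmark can be run with memory_profiler and timeit, along the lines sketched below; the synthetic image, clip limits and tile grid are placeholders, not the settings from the actual notebook.

import timeit

import cv2
import numpy as np
from memory_profiler import memory_usage
from skimage import exposure

# Synthetic grayscale image standing in for the real benchmark input.
image = np.random.randint(0, 256, size=(512, 512), dtype=np.uint8)


def run_opencv():
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(image)


def run_skimage():
    return exposure.equalize_adapthist(image, clip_limit=0.02)


for name, fn in [("OpenCV", run_opencv), ("scikit-image", run_skimage)]:
    peak_mib = max(memory_usage((fn, (), {})))  # sampled peak memory in MiB
    per_call = min(timeit.repeat(fn, number=10, repeat=5)) / 10  # best-of-5 seconds per call
    print(f"{name}: peak {peak_mib:.2f} MiB, {per_call * 1e3:.2f} ms per call")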
I am using memory_profiler for benchmarking, and it does not benchmark batching. How can I benchmark batched preprocessing, or is per-call memory profiling like the above enough?
You could also try to trace/benchmark some specific area of the code:
https://www.tensorflow.org/api_docs/python/tf/profiler/experimental/Trace
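For example, a trace around just the CLAHE call could look roughly like the sketch below; the log directory, step count and the profile_clahe helper are made up for illustration, and it assumes a dataset that yields image batches before CLAHE has been applied.

import tensorflow as tf


def profile_clahe(clahe, dataset, logdir="/tmp/clahe_profile", steps=5):
    # Record a profile covering only a few batches.
    tf.profiler.experimental.start(logdir)
    for step, (images, labels) in enumerate(dataset.take(steps)):
        # Each Trace context shows up as a named event on the profiler timeline.
        with tf.profiler.experimental.Trace("clahe", step_num=step):
            _ = clahe(images)
    tf.profiler.experimental.stop()

# Inspect the result afterwards with: tensorboard --logdir /tmp/clahe_profile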
@adhadse would love to merge this once you can add the tests and address comments! Please let me know if you will be picking this back up!
@adhadse let me know if you will be resolving the final comments, or if @ianjjohnson or I can carry this to the finish line!
@LukeWood Is anyone picking this up? If not, I would love to take this layer to completion.
I think given the lack of activity on this PR, it would be reasonable for you to pick this up if you're interested @MrinalTyagi
Sure, just fork this PR @MrinalTyagi!
Feel free to re-open if interested in running this to the finish line, but it doesn't seem that the contributor will be returning.