
Added CLAHE augmentation layer

Open adhadse opened this issue 2 years ago • 14 comments

Resolves #359.

Based on the implementation by @isears, this PR is ready for review, @LukeWood!

A Huge Thanks to @isears!!!

Sample augmentation on the oxford_flowers102 dataset: [image]

On this chest X-ray dataset: [image]

adhadse avatar May 09 '22 14:05 adhadse

Seems to run out of memory when running on oxford_flowers102. I didn't have time to investigate it fully, but you might want to look into it.

This should help you reproduce it:

import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_datasets as tfds

from keras_cv.layers import preprocessing

IMG_SIZE = (224, 224)
BATCH_SIZE = 64


def resize(image, label, num_classes=10):
    image = tf.image.resize(image, IMG_SIZE)
    label = tf.one_hot(label, num_classes)
    return image, label


def main():
    data, ds_info = tfds.load("oxford_flowers102", with_info=True, as_supervised=True)
    train_ds = data["train"]

    num_classes = ds_info.features["label"].num_classes

    train_ds = (
        train_ds.map(lambda x, y: resize(x, y, num_classes=num_classes))
        .shuffle(10 * BATCH_SIZE)
        .batch(BATCH_SIZE)
    )

    clahe = preprocessing.CLAHE([0, 255])

    train_ds = train_ds.map(
        lambda x, y: (clahe(x), y),
        num_parallel_calls=tf.data.AUTOTUNE,
    )

    for images, labels in train_ds.take(1):
        plt.figure(figsize=(8, 8))
        for i in range(9):
            plt.subplot(3, 3, i + 1)
            plt.imshow(images[i].numpy().astype("uint8"))
            plt.axis("off")
        plt.show()


if __name__ == "__main__":
    main()

This is the standard template for demo files.

quantumalaviya avatar May 09 '22 17:05 quantumalaviya

@adhadse Thank you for creating this PR! Great to see a histogram equalization scheme from the imaging world land in the TensorFlow ecosystem.

Can we provide some sample images in the PR?

LukeWood avatar May 10 '22 00:05 LukeWood

@quantumalaviya I'll look into the issue; I was also warned about this memory issue on my 12 GB system during testing. @LukeWood I'll add a Colab notebook soon, addressing the above issue and including a sample augmentation image.

adhadse avatar May 10 '22 00:05 adhadse

@quantumalaviya I think this is because the computation is really expensive, even with interpolation. Just changing BATCH_SIZE to a smaller value (I changed it to 32) seems to make it work fine. Let me know if there are other issues. Link to Colab notebook (might be deleted in the future)
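For intuition on why halving the batch helps, a back-of-the-envelope sketch (assuming float32 intermediates at the 224×224×3 resize shape used in the repro script above; `batch_mib` is a hypothetical helper, not part of keras-cv):

```python
def batch_mib(batch, h=224, w=224, c=3, bytes_per_value=4):
    """Approximate size in MiB of one float32 image batch; every
    intermediate tensor the layer materializes is at least this large."""
    return batch * h * w * c * bytes_per_value / 2**20

print(batch_mib(64))  # 36.75 MiB per intermediate tensor
print(batch_mib(32))  # 18.375 MiB
```

Since a CLAHE implementation materializes several per-tile histograms and lookup tables on top of the image tensors themselves, peak memory scales with several multiples of this figure, which may be why a 2× batch reduction is enough to move the pipeline back under budget.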

adhadse avatar May 10 '22 01:05 adhadse

@LukeWood I have now added a sample illustration image for CLAHE.

adhadse avatar May 10 '22 04:05 adhadse

@quantumalaviya ... Just changing the BATCH_SIZE to a little smaller value (I changed it to 32), this seems to work pretty fine.

I think it would be impractical to use CLAHE in general at such a limited batch size; optimization is needed here. Curious: do other implementations (scikit-learn, OpenCV) also run out of memory?

cc @adhadse @isears
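For readers comparing the implementations above, the core per-tile work is the same everywhere. A minimal NumPy sketch of tile-wise clipped histogram equalization (it omits the bilinear interpolation between tiles that real CLAHE performs, and all names here are illustrative, not keras-cv API):

```python
import numpy as np

def clahe_tiles_sketch(img, grid=8, clip_limit=40, nbins=256):
    """Tile-wise clipped histogram equalization on a uint8 grayscale
    image. Real CLAHE also bilinearly interpolates the per-tile
    mappings to avoid blocking artifacts; this sketch skips that."""
    h, w = img.shape
    th, tw = h // grid, w // grid
    out = img.copy()
    for i in range(grid):
        for j in range(grid):
            patch = img[i * th:(i + 1) * th, j * tw:(j + 1) * tw]
            hist, _ = np.histogram(patch, bins=nbins, range=(0, nbins))
            # Clip the histogram and redistribute the excess uniformly;
            # this is the "contrast limited" part of CLAHE.
            excess = np.maximum(hist - clip_limit, 0).sum()
            hist = np.minimum(hist, clip_limit) + excess // nbins
            # Build the per-tile lookup table from the clipped CDF.
            cdf = hist.cumsum()
            lut = (cdf - cdf.min()) * 255 // max(cdf[-1] - cdf.min(), 1)
            out[i * th:(i + 1) * th, j * tw:(j + 1) * tw] = lut[patch]
    return out
```

Each image needs grid² histograms plus lookup tables; vectorizing this across a whole batch materializes batch × grid² × nbins intermediates at once, which may be one reason a batched TensorFlow implementation is memory-hungry while OpenCV's per-image C++ loop is not.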

innat avatar May 12 '22 19:05 innat

Other than the memory issue, and since it could be related: do we have the same implementation issues here as in https://github.com/tensorflow/addons/pull/2362#issuecomment-767136266?

bhack avatar May 12 '22 19:05 bhack

@bhack @innat Since the implementation is derived from the same PR, the same issue follows here. I think the memory requirements of the implementations across OpenCV, Scikit-Learn, and this PR are all high.

Check Benchmark notebook

I wasn't able to build tfa from source to benchmark the memory use of @isears's CLAHE PR. Either way, the results show high memory requirements.

Beyond that, the runtime performance of the implementation in this PR is the worst of the three and needs to be improved, as previously discussed by @bhack.

| Lib | Memit | Timeit |
| --- | --- | --- |
| OpenCV | peak memory: 1140.96 MiB, increment: 0.00 MiB | 1000 loops, best of 5: 324 µs per loop |
| Scikit-Learn | peak memory: 1140.96 MiB, increment: 0.00 MiB | 10 loops, best of 5: 25.6 ms per loop |
| This PR | peak memory: 1141.30 MiB, increment: 0.34 MiB | 1 loop, best of 5: 358 ms per loop |

I am using memory_profiler for benchmarking, and it does not profile batching. How can I benchmark batched preprocessing, or is normal memory profiling like the above enough?
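One stdlib option for a per-batch measurement is to wrap a single batch's preprocessing call in `tracemalloc` (a sketch with a hypothetical stand-in workload; note that tracemalloc only sees Python-level allocations, so for TensorFlow kernels it gives a lower bound and `tf.profiler` is the better tool):

```python
import tracemalloc

def peak_mib(fn, *args):
    """Return the Python-level peak allocation of fn(*args) in MiB."""
    tracemalloc.start()
    try:
        fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / 2**20

def fake_batch_step(n):
    # Hypothetical stand-in for one batch of preprocessing work:
    # allocates roughly n KiB of Python objects.
    return [bytes(1024) for _ in range(n)]

print(peak_mib(fake_batch_step, 2048))  # roughly 2 MiB
```

Calling `peak_mib` once per batch inside the `tf.data` loop would give a per-batch profile rather than a whole-process peak.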

adhadse avatar May 15 '22 09:05 adhadse

You could also try to trace/benchmark some specific area of the code:

https://www.tensorflow.org/api_docs/python/tf/profiler/experimental/Trace

bhack avatar May 15 '22 09:05 bhack

@adhadse would love to merge this once you can add the tests and address comments! Please let me know if you will be picking this back up!

LukeWood avatar Jun 11 '22 18:06 LukeWood

@adhadse let me know whether you will be resolving the final comments, or whether @ianjjohnson or I should carry this to the finish line!

LukeWood avatar Jul 13 '22 17:07 LukeWood

@LukeWood Is anyone picking this up? If not, I would love to close this one out.

MrinalTyagi avatar Aug 06 '22 15:08 MrinalTyagi

@LukeWood Is anyone picking this up? If not, I would love to close this one out.

I think given the lack of activity on this PR, it would be reasonable for you to pick this up if you're interested @MrinalTyagi

ianstenbit avatar Aug 15 '22 16:08 ianstenbit

Sure, just fork this PR @MrinalTyagi!

LukeWood avatar Aug 15 '22 18:08 LukeWood

Feel free to re-open if you're interested in running this to the finish line, but it doesn't seem that the contributor will be returning.

LukeWood avatar Oct 18 '22 15:10 LukeWood