`tf.vectorized_map` fallback cause

bhack opened this issue 3 years ago • 28 comments

This is the current list of fallbacks we have on master, with the related cause, for our KPLs ~(it needs tensorflow/tensorflow#55562 or tf-nightly when merged)~:

pytest -rP keras_cv/layers/preprocessing/ | grep while_loop | sort | uniq

WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting AdjustHue cause Input "delta" of op 'AdjustHue' expected to be loop invariant.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting AdjustSaturation cause Input "scale" of op 'AdjustSaturation' expected to be loop invariant.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting DepthwiseConv2dNative cause Input "filter" of op 'DepthwiseConv2dNative' expected to be not loop invariant.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting DynamicPartition cause there is no register converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting DynamicStitch cause there is no register converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting HistogramFixedWidth cause there is no register converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting ImageProjectiveTransformV3 cause there is no register converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting RandomShuffle cause there is no register converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting StridedSlice cause Input "input" of op 'StridedSlice' expected to be not loop invariant.

You can find exactly which test is involved by running: pytest -rP keras_cv/layers/preprocessing/

Some of these could be fixed with a refactor in our code; others need a contribution (trivial or not) for a registered converter (e.g. see https://github.com/keras-team/keras-cv/issues/259#issuecomment-1094140155).

I think we should take care of the following:

  • monitor how many fallbacks we are accumulating directly in the CI, with a GitHub Action, so that we can check this at the PR stage (a rough sketch of such a check follows this list)
  • contribute the (trivial?) converters that we care about in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/parallel_for/pfor.py#L920-L997
  • (extra) extend the same policy to jit_compile when we have our compile API ready
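On the first point, here is a rough sketch of how a fallback detector could work (not an actual CI script; the layer, the batch shape, and the reliance on the standard "tensorflow" Python logger are just assumptions):

import logging

import tensorflow as tf

def count_pfor_fallbacks(fn, batch):
    # Collect every log record emitted while tracing the vectorized call;
    # pfor reports its while_loop fallbacks through the "tensorflow" logger.
    records = []
    handler = logging.Handler()
    handler.emit = records.append
    tf_logger = logging.getLogger("tensorflow")
    tf_logger.addHandler(handler)
    try:
        tf.vectorized_map(fn, batch)
    finally:
        tf_logger.removeHandler(handler)
    return [r.getMessage() for r in records if "while_loop" in r.getMessage()]

# Example usage with a hypothetical layer and batch size:
layer = tf.keras.layers.RandomRotation(factor=0.1)
images = tf.random.uniform((8, 64, 64, 3))
print(count_pfor_fallbacks(lambda x: layer(x, training=True), images))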

bhack avatar Apr 10 '22 13:04 bhack

/cc @wangpengmit @rohan100jain

bhack avatar Apr 10 '22 13:04 bhack

Just for reference, I copy-paste the current pfor heuristic:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/parallel_for/pfor.py#L3372-L3389

  The conversion process uses a simple greedy heuristic. It walks the loop body
  and tries to express the functionality of running each node in a loop with a
  new set of nodes. When converting an op several cases are possible:
  - The op is not inside the loop body. Hence it can be used as is.
  - The op does not depend on the iteration number and is stateless. In this
    case, it can be used as is.
  - The op is not stateful, and depends on iteration number only through control
    dependencies. In this case, we can create a single op with same inputs and
    attributes, but with "converted" control dependencies.
  - The op is not stateful, and all its inputs are loop invariant. In this
    case, similar to above, we can create a single op with same inputs and
    attributes, but with "converted" control dependencies.
  - The op is stateful or at least one of the inputs is not loop invariant. In
    this case, we run the registered converter for that op to create a set of
    converted ops. All nodes in the set will have converted control dependencies
    corresponding to control dependencies of the original op. If the op returned
    multiple outputs, "converted outputs" could be produced by different ops in
    this set.
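To make the "loop invariant" wording above concrete, here is a minimal sketch (the shapes are just illustrative): bias is the same value on every iteration, so it is loop invariant ("unstacked"), while each slice of images changes per iteration ("stacked"):

import tensorflow as tf

images = tf.random.uniform((4, 8, 8, 3))
bias = tf.constant(0.5)  # same value on every iteration: loop invariant ("unstacked")

def per_sample(x):
    # x is a different slice on every iteration: not loop invariant ("stacked").
    # AddV2 has a registered converter, so no while_loop fallback is needed.
    return x + bias

out = tf.vectorized_map(per_sample, images)
print(out.shape)  # (4, 8, 8, 3)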

bhack avatar Apr 11 '22 14:04 bhack

I want to add an example, picking the first fallback in the list that already has a registered converter:

WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting AdjustHue cause Input "delta" of op 'AdjustHue' expected to be loop invariant.

This is the converter:

@RegisterPFor("AdjustHue")
def _convert_adjust_hue(pfor_input):
  images = pfor_input.stacked_input(0)
  delta = pfor_input.unstacked_input(1)
  return wrap(gen_image_ops.adjust_hue(images, delta), True)

So we cannot supersede the limit of the original design of tf.image.adjust_hue, which is already vectorized on the batch dimension but not on the delta arg, as delta is just a float value:

def adjust_hue(image, delta, name=None):
  ....
  ....
  Args:
    image: RGB image or images. The size of the last dimension must be 3.
    delta: float.  How much to add to the hue channel.
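For illustration, here is a minimal way to trigger this fallback (the shapes and the per-sample delta are just assumptions about how a KPL could randomize within the batch): when each sample draws its own delta inside the mapped function, "delta" is no longer loop invariant, so the converter above cannot be used and pfor falls back to a while_loop:

import tensorflow as tf

images = tf.random.uniform((4, 8, 8, 3))

def per_sample(image):
    delta = tf.random.uniform((), -0.1, 0.1)  # a different delta per sample
    return tf.image.adjust_hue(image, delta)

# Logs: Using a while_loop for converting AdjustHue cause Input "delta" ...
out = tf.vectorized_map(per_sample, images)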

IMHO this points again to the original topic: what kind of empirical evidence we have to sustain within-batch randomization, with the current set of native ops in TF, vs. alternative solutions like https://github.com/keras-team/keras-cv/pull/146#issuecomment-1048315498, at least as a workaround to avoid generating a random augmentation for each sample in the batch for each op.

bhack avatar Apr 11 '22 15:04 bhack

As tf.image.* is also currently MIA/orphaned, I don't think we could extend the API and kernels to add non-scalar support for the delta, scale, etc. params.

bhack avatar Apr 13 '22 15:04 bhack

Have you talked about this internally between teams? Any action items?

bhack avatar Apr 15 '22 19:04 bhack

We (the TF core team) don't monitor keras-team/keras-cv. The issue needs to be reposted to tensorflow/tensorflow to be on our radar.

wangpengmit avatar Apr 15 '22 20:04 wangpengmit

@wangpengmit I understand, but this is really a boundary topic. What kind of ticket do you want for TF (core)? Just for the missing converters?

bhack avatar Apr 15 '22 20:04 bhack

It can be just a title and a link to this one. TF has a system for alerting TF developers of tensorflow/tensorflow issues.

wangpengmit avatar Apr 15 '22 21:04 wangpengmit

Ok, let's see what we can do in TF at https://github.com/tensorflow/tensorflow/issues/55639 and what we need to change in Keras-cv.

bhack avatar Apr 15 '22 21:04 bhack

I updated the fallback list on the current master (https://github.com/keras-team/keras-cv/commit/b54292159c3af5239a150aede34fa07a4051d35f):

WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting AdjustContrastv2 cause Input "contrast_factor" of op 'AdjustContrastv2' expected to be loop invariant.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting AdjustHue cause Input "delta" of op 'AdjustHue' expected to be loop invariant.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting AdjustSaturation cause Input "scale" of op 'AdjustSaturation' expected to be loop invariant.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting DepthwiseConv2dNative cause Input "filter" of op 'DepthwiseConv2dNative' expected to be not loop invariant.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting DynamicPartition cause there is no registered converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting DynamicStitch cause there is no registered converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting HistogramFixedWidth cause there is no registered converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting ImageProjectiveTransformV3 cause there is no registered converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting RandomShuffle cause there is no registered converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting RngReadAndSkip cause there is no registered converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting StatelessRandomGetKeyCounter cause there is no registered converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting StatelessRandomUniformFullIntV2 cause there is no registered converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting StatelessRandomUniformV2 cause there is no registered converter for this op.
WARNING  tensorflow:pfor.py:1082 Using a while_loop for converting StridedSlice cause Input "input" of op 'StridedSlice' expected to be not loop invariant.

bhack avatar Apr 28 '22 13:04 bhack

@qlzh727 Just an extra note, as you have already seen in https://github.com/tensorflow/tensorflow/issues/56242

We need to extend this analysis also to the Keras repo for: https://github.com/keras-team/keras/blob/master/keras/layers/preprocessing/image_preprocessing.py

bhack avatar May 24 '22 23:05 bhack

Ack. Will take a look later this week.

qlzh727 avatar May 25 '22 18:05 qlzh727

@qlzh727 Just an extra note, as you have already seen in tensorflow/tensorflow#56242

We need to extend this analysis also to the Keras repo for: https://github.com/keras-team/keras/blob/master/keras/layers/preprocessing/image_preprocessing.py

We could write a script to iterate the symbols in the keras_cv.layers.preprocessing namespace. That should cover all KPLs from keras core and keras cv as we alias the core ones there

LukeWood avatar May 25 '22 22:05 LukeWood

We could write a script to iterate the symbols in the keras_cv.layers.preprocessing namespace. That should cover all KPLs from keras core and keras cv as we alias the core ones there

Yes, I used some quick and dirty bash commands locally to produce these "reports". A new CI Action would be ideal, so that we also have a detector for newly introduced fallbacks in candidate PRs (a rough sketch is at the end of this comment).

(extra) extend the same policy to jit_compile when we have https://github.com/keras-team/keras-cv/issues/165#issuecomment-1083502165

This would also be nice to have later, once we have the mentioned API in Keras master.
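A rough sketch of the kind of report script mentioned above (it assumes, hypothetically, that the interesting symbols in the namespace are layer classes constructible with default arguments, which does not hold for all of them; those are simply skipped, and the pfor warnings still have to be grepped from the log as before):

import inspect

import tensorflow as tf
import keras_cv.layers.preprocessing

images = tf.random.uniform((4, 64, 64, 3))
namespace = vars(keras_cv.layers.preprocessing)
for name, obj in sorted(namespace.items()):
    if not (inspect.isclass(obj) and issubclass(obj, tf.keras.layers.Layer)):
        continue
    try:
        layer = obj()  # layers with required constructor args are skipped
        tf.vectorized_map(lambda x: layer(x, training=True), images)
    except Exception as exc:
        print(f"{name}: skipped ({exc})")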

bhack avatar May 25 '22 23:05 bhack

CCing @ishark.

wangpengmit avatar Jul 25 '22 21:07 wangpengmit

Does this problem still exist?

Mirandatz avatar Aug 29 '22 18:08 Mirandatz

Does this problem still exist?

Yes, this is the updated list as of today, for the preprocessing layers only:

pytest -rP keras_cv/layers/preprocessing/ | grep while_loop | sort | uniq
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting AdjustContrastv2
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting AdjustHue
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting AdjustSaturation
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting Bitcast
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting CropAndResize
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting DecodeJpeg
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting DepthwiseConv2dNative
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting EncodeJpegVariableQuality
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting HistogramFixedWidth
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting ImageProjectiveTransformV3
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting RandomShuffle
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting RngReadAndSkip
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting StatelessRandomGetKeyCounter
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting StatelessRandomUniformFullIntV2
WARNING  tensorflow:pfor.py:1081 Using a while_loop for converting StatelessRandomUniformV2

bhack avatar Aug 29 '22 18:08 bhack

@bhack Thank you for the quick reply! Saved me a lot of container-rebuilding time.

Also: :(

Mirandatz avatar Aug 29 '22 18:08 Mirandatz

@wangpengmit Can I ask why you mentioned @ishark a month ago? Is there a new API ownership plan?

bhack avatar Aug 29 '22 19:08 bhack

Hi all,

Thank you for your work on this issue. I was wondering if there is a recommended workaround (besides using TF 2.8), or if there is an estimate for when this may be fixed for the most commonly used augmentation layers. While my augmentation pipeline doesn't directly rely on any of the items listed in https://github.com/keras-team/keras-cv/issues/291#issuecomment-1230707124, some of the tf.keras.layers.RandomX layers rely on Bitcast, StatelessRandomUniformV2, RngReadAndSkip, and ImageProjectiveTransformV3. I'd be willing to help contribute a fix, but I'm not familiar with this part of the codebase or what the vectorization process typically looks like.

I've tried both TF Nightly and TF 2.9, and my time per epoch increases from 21 seconds on TF 2.8 to 9 minutes, a 25x slowdown. Removing all augmentations from my model restores the expected training time. This is a similar issue to https://github.com/keras-team/keras-cv/issues/581, though perhaps worse since I am using a few different augmentations.

If this is going to take a while to fix, would it make sense to roll back https://github.com/keras-team/keras/commit/dbc4526978865b372e794accbaa7c584f9c86a0f in the meantime? I don't see an indication in the documentation that these layers are effectively unusable at the moment if you want acceptable training speed, and it looks like nightly has already been broken for a while. Thank you for your time.

TylerADavis avatar Aug 30 '22 21:08 TylerADavis

While my augmentation pipeline doesn't directly rely on any of the items listed in https://github.com/keras-team/keras-cv/issues/291#issuecomment-1230707124, some of the tf.keras.layers.RandomX layers rely on Bitcast, StatelessRandomUniformV2, RngReadAndSkip, and ImageProjectiveTransformV3

My list only covers the "failing" TF ops that are used in the Keras-cv preprocessing layers implementation, so it is not a list of the involved layers.

bhack avatar Aug 30 '22 22:08 bhack

I'd be willing to help contribute a fix, but I'm not familiar with this part of the codebase or what the vectorization process typically looks like.

As we have had a list of open points on this topic to clarify since the early days of this repo, it is not so easy to identify a clear action item / contribution path.

Please check my list at: https://github.com/keras-team/keras-cv/issues/581#issuecomment-1184758866

bhack avatar Aug 30 '22 22:08 bhack

@wangpengmit Can I ask why you mentioned @ishark a month ago? Is there a new API ownership plan?

Yeah, we still lack a primary owner of tf.vectorized_map, but @ishark is the secondary owner of it now.

wangpengmit avatar Aug 30 '22 23:08 wangpengmit

I've tried both TF Nightly and TF 2.9, and my time per epoch increases from 21 seconds on TF 2.8 to 9 minutes, a 25x slowdown. Removing all augmentations from my model restores the expected training time. This is a similar issue to #581, though perhaps worse since I am using a few different augmentations.

Can you post which layers you're using?

LukeWood avatar Aug 31 '22 00:08 LukeWood

Hi Luke,

Here is my augmentation code. I am using augment_inputs() to wrap multiple layers within my model, and model.fit() is being fed with a tf.data.DataSet. Please let me know if there is anything else I can provide.

def augment_inputs(inputs):
    inputs = tf.keras.layers.RandomZoom(height_factor=(-0.20, 0))(inputs)
    inputs = tf.keras.layers.RandomFlip()(inputs)
    inputs = tf.keras.layers.RandomRotation(factor=0.1)(inputs)
    inputs = tf.keras.layers.RandomTranslation(height_factor=0.1,
                                               width_factor=0.1)(inputs)
    return inputs

def build_model():
    ## Input 1

    old_inputs = layers.Input(shape=IMAGE_SIZE, name='input_oldest')
    # First input for 'data_a'
    inputs = scaler(old_inputs)
    inputs = augment_inputs(inputs)
    # .... rest of the model building code below 

TylerADavis avatar Aug 31 '22 00:08 TylerADavis

Hi Luke,

Here is my augmentation code. I am using augment_inputs() to wrap multiple layers within my model, and model.fit() is being fed with a tf.data.DataSet. Please let me know if there is anything else I can provide.

def augment_inputs(inputs):
    inputs = tf.keras.layers.RandomZoom(height_factor=(-0.20, 0))(inputs)
    inputs = tf.keras.layers.RandomFlip()(inputs)
    inputs = tf.keras.layers.RandomRotation(factor=0.1)(inputs)
    inputs = tf.keras.layers.RandomTranslation(height_factor=0.1,
                                               width_factor=0.1)(inputs)
    return inputs

def build_model():
    ## Input 1

    old_inputs = layers.Input(shape=IMAGE_SIZE, name='input_oldest')
    # First input for 'data_a'
    inputs = scaler(old_inputs)
    inputs = augment_inputs(inputs)
    # .... rest of the model building code below 

I think I see the problem: you're creating a new layer on every function call, so you'll retrace every single call, which is expensive. Create the layers outside the function.
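For example, a sketch of the pattern I mean (keeping your factors):

import tensorflow as tf

# Build the augmentation layers once, at model-construction time, and reuse them.
augmenter = tf.keras.Sequential([
    tf.keras.layers.RandomZoom(height_factor=(-0.20, 0)),
    tf.keras.layers.RandomFlip(),
    tf.keras.layers.RandomRotation(factor=0.1),
    tf.keras.layers.RandomTranslation(height_factor=0.1, width_factor=0.1),
])

def augment_inputs(inputs):
    return augmenter(inputs)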

LukeWood avatar Aug 31 '22 00:08 LukeWood

Hello, my apologies, I missed the GitHub notifications. Will go through the issue history and get back to you. Thanks.

ishark avatar Aug 31 '22 00:08 ishark

@LukeWood Thank you for the recommendation. I rewrote my augmentation following your recommendation to create the layers outside of the function, as shown below, but I am still seeing the 9 minute epochs, versus the 21 second epochs I expect.

def build_model():
    backbone = tf.keras.applications.efficientnet_v2.EfficientNetV2B0(
        weights='imagenet', input_shape=IMAGE_SIZE, include_top=False)
    backbone.trainable = False

    ## Input 1

    old_inputs = layers.Input(shape=IMAGE_SIZE, name='input_oldest')
    # First input for 'data_a'
    inputs = scaler(old_inputs)
    # inputs = augment_inputs(inputs)
    inputs = tf.keras.layers.RandomZoom(height_factor=(-0.20, 0))(inputs)
    inputs = tf.keras.layers.RandomFlip()(inputs)
    inputs = tf.keras.layers.RandomRotation(factor=0.1)(inputs)
    inputs = tf.keras.layers.RandomTranslation(height_factor=0.1,
                                               width_factor=0.1)(inputs)

    oldest_outputs = tf.keras.layers.GlobalAveragePooling2D()(backbone(inputs, training=False))
    #... rest of the model

    return model, backbone


model, backbone = build_model()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=METRICS)

model.fit(train,
          validation_data=validation,
          epochs=50)

TylerADavis avatar Aug 31 '22 00:08 TylerADavis

If you, like me, are still facing this issue, my "workaround" is:

pip uninstall keras -y
pip install keras==2.8.0

Mirandatz avatar Feb 02 '23 03:02 Mirandatz

Hey @Mirandatz - we rolled back changes to the KPLs in the most recent keras release, and we'll be fixing them forwards in KerasCV! Sorry for any inconvenience.

LukeWood avatar Feb 02 '23 04:02 LukeWood