keras Called a function referencing variables which have been deleted. `layers.RandomCrop` with Subclass API

System information.

TensorFlow version (use command below): 2.8
Environments: Colab

Describe the problem

I've written a model with the subclassing API. In the model, I use augmentation layers, for example random flip, random crop, etc. The model trains without any issue. After training finished, I saved the model with model.save(...), and it saved without any issue.
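For illustration, here is a minimal sketch of the kind of subclassed model described above. It is not the code from the linked reproduction; the class name `SubclassedModel`, the layer sizes, and the crop size are assumptions made for the example. It only builds the model and runs a forward pass; the save and reload steps that trigger the error on TF 2.8 are not included.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class SubclassedModel(keras.Model):
    """Hypothetical subclassed model with built-in augmentation layers."""

    def __init__(self, num_classes=10, **kwargs):
        super().__init__(**kwargs)
        # Augmentation block; RandomCrop is the layer the report singles out.
        self.augment = keras.Sequential(
            [layers.RandomFlip("horizontal"), layers.RandomCrop(24, 24)]
        )
        self.conv1 = layers.Conv2D(16, 3, activation="relu")
        self.pool = layers.GlobalAveragePooling2D()
        self.head = layers.Dense(num_classes, activation="softmax")

    def call(self, inputs, training=None):
        # Pass the training flag through to the augmentation layers.
        x = self.augment(inputs, training=training)
        x = self.conv1(x)
        x = self.pool(x)
        return self.head(x)


model = SubclassedModel()
out = model(tf.ones((1, 28, 28, 1)), training=False)
print(out.shape)
```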

The problem arises when I reload the saved model and try to predict with it. It raises:

Called a function referencing variables that have been deleted. 
This likely means that function-local variables were created and not referenced 
elsewhere in the program. This is generally a mistake; consider storing variables in 
an object attribute on the first call.
    
Call arguments received:
  • args=('tf.Tensor(shape=(None, 28, 28, 1), dtype=float32)',)
  • kwargs={'training': 'False'}

Describe the current behavior

The whole workflow works fine with a Sequential model but fails with the subclassing API. That is, if I use augmentation layers in a Sequential model, everything works, but if I use them in a subclassed model, it fails with the above error message.

Note that even with the subclassed model, not every augmentation layer causes this issue; I only encounter it with layers.RandomCrop. Interestingly, if I implement and use a custom random crop layer instead of the built-in layer, the error doesn't occur.
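As a rough idea of what such a custom layer could look like, here is a sketch built on `tf.image.random_crop`. The class name `CustomRandomCrop` and its behavior at inference time (a center crop) are assumptions for illustration, not the exact layer used in the report. It also assumes `training` arrives as a Python bool or None and that the input height and width are statically known.

```python
import tensorflow as tf


class CustomRandomCrop(tf.keras.layers.Layer):
    """Hypothetical custom crop layer; not the layer from the report."""

    def __init__(self, height, width, **kwargs):
        super().__init__(**kwargs)
        self.height = height
        self.width = width

    def call(self, inputs, training=None):
        if training:
            # Random crop; the same offset is applied to the whole batch.
            batch = tf.shape(inputs)[0]
            channels = inputs.shape[-1]
            size = [batch, self.height, self.width, channels]
            return tf.image.random_crop(inputs, size=size)
        # Assumed inference behavior: deterministic center crop.
        top = (inputs.shape[1] - self.height) // 2
        left = (inputs.shape[2] - self.width) // 2
        return inputs[:, top:top + self.height, left:left + self.width, :]

    def get_config(self):
        # Expose constructor arguments so the layer can be re-created.
        config = super().get_config()
        config.update({"height": self.height, "width": self.width})
        return config


layer = CustomRandomCrop(24, 24)
images = tf.ones((2, 28, 28, 1))
train_out = layer(images, training=True)
infer_out = layer(images, training=False)
print(train_out.shape, infer_out.shape)
```

Note that this layer creates no variables of its own.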

Describe the expected behavior.

What does that error mean? Why does it occur only with the subclassing API and only with some of the built-in augmentation layers, i.e. layers.RandomCrop, but not with a custom random crop layer?

Contributing.

Do you want to contribute a PR? (yes/no): No

Standalone code to reproduce the issue.

Reproducible CODE.

Jun 05 '22 01:06 innat

@innat I tried to replicate the issue and faced a different error. Could you please find the gist here and let me know if I am missing something to reproduce the issue. Thank you!

Jun 07 '22 10:06 sushreebarsa

@sushreebarsa Thanks for running the test. You're using tf 2.9. I tried to run your gist with GPU, and I got

---------------------------------------------------------------------------
UnimplementedError                        Traceback (most recent call last)
<ipython-input-5-97bbab2d8774> in <module>()
     38 
     39 model = Model()
---> 40 model(tf.ones(shape=(1, 28, 28, 1))).shape
     41 model.summary(expand_nested=True)

1 frames
<ipython-input-5-97bbab2d8774> in call(self, inputs, training)
     28     def call(self, inputs, training=None):
     29         x = self.augment(inputs, training=training)
---> 30         x = self.conv1(x)
     31         x = self.maxp1(x)
     32         x = self.conv2(x)

UnimplementedError: Exception encountered when calling layer "conv2d" (type Conv2D).

DNN library is not found. [Op:Conv2D]

Call arguments received by layer "conv2d" (type Conv2D):
  • inputs=tf.Tensor(shape=(1, 32, 32, 1), dtype=float32)

Is it safe to test with tf 2.9 on colab?

Jun 07 '22 11:06 innat

The error message when running on CPU,

 Unimplemented `tf.keras.Model.call()`: if you intend to create a `Model` with the Functional API, please provide `inputs` and `outputs` arguments. Otherwise, subclass `Model` with an overridden `call()` method.

is also possibly a bug in tf 2.9. Clearly there is a call method in the subclass model!

Jun 07 '22 11:06 innat

@sushreebarsa I made another ticket regarding this behavior. Please check https://github.com/keras-team/keras/issues/16662

Could you please find the gist here and let me know if I am missing something to reproduce the issue.

And please use tf 2.8 to reproduce the issue.

Jun 07 '22 11:06 innat

@innat Thank you for the update! @gowthamkpr I was able to replicate the issue on colab using TF v2.8.0, please find the gist here for reference. Thank you!

Jun 07 '22 11:06 sushreebarsa

This bug appears to be fixed on 2.9 actually!

You can check that by changing your loading line to

new_model = keras.models.load_model('/content/my_awaesome_model', custom_objects={"Model": Model})

That hits the error you describe on 2.8, and everything works on 2.9. As for whether you should need to provide a custom object for a subclass model, I am not sure (will check with others tomorrow), but let's follow up on the other bug you filed (https://github.com/keras-team/keras/issues/16662) for that.

I think this particular bug is now fixed on the latest tf release.

Thanks!

Jun 16 '22 02:06 mattdangerw

So the conclusion is that this is a bug in tf 2.8. It's fixed in tf 2.9, but at reload time we need to use custom_objects. With tf 2.9 and custom_objects, the other issue (https://github.com/keras-team/keras/issues/16662) also seems to be solved.

Before closing the issue: do you think it's necessary to use a custom object here? Can we come up with a workaround for 2.5 <= tf <= 2.8? Also, how should these error messages be interpreted? Why does this happen?

Jun 16 '22 07:06 innat

I checked with people today: we should not need the custom objects for a subclass model to work. So I think this bug can be marked fixed, but we will need to follow up on https://github.com/keras-team/keras/issues/16662 to figure out that side of things.

I'm not sure what the best workaround for 2.8 would be. Would switching to a functional model work? I don't think this is the sort of fix we would try to cherry-pick back to 2.8, especially given that the underlying implementation has been changing dramatically, so we would likely need a separate solution.
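To make the functional-model suggestion concrete, here is a minimal sketch of the same kind of network written with the Functional API. Whether this avoids the reload error on 2.8 is not confirmed in this thread; the helper name `build_functional_model` and all layer sizes are assumptions for the example.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


def build_functional_model(input_shape=(28, 28, 1), crop=24, num_classes=10):
    """Hypothetical functional equivalent of a subclassed model with augmentation."""
    inputs = keras.Input(shape=input_shape)
    # Built-in augmentation layers are wired into the graph like any other layer.
    x = layers.RandomFlip("horizontal")(inputs)
    x = layers.RandomCrop(crop, crop)(x)
    x = layers.Conv2D(16, 3, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)


functional_model = build_functional_model()
preds = functional_model(tf.ones((1, 28, 28, 1)), training=False)
print(preds.shape)
```

A functional model is described by its layer graph and layer configs rather than by a Python `call()` method.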

Jun 16 '22 17:06 mattdangerw
