
Unpredictable behaviour of tracked variables in subclassing layers and models

Open JaimeArboleda opened this issue 2 years ago • 9 comments

I've found that the tracking of variables by Keras is not 100% predictable, and this has an impact because untracked variables are not trained.

This is a Google Colab notebook with code to reproduce the issue (please bear in mind that the example is a toy one; the layer and model make no sense and exist only to illustrate the issue):

https://colab.research.google.com/drive/108tvoGVHDTHuoMP39ZsPKUbSsnO_jRYr#scrollTo=4ikYkJzxWuHy

You can see there that when subclassing a Keras Layer:

  • You can define variables in a list using a list comprehension.
  • You can define other Keras Layers in a list by starting with an empty list and appending to it.
  • You cannot define variables in a list by starting with an empty list and appending to it.

And when subclassing a Keras Model:

  • You can define other Keras Layers in a list by starting with an empty list and appending to it.
  • You can define variables in a list by starting with an empty list and appending to it.

Personally, I find the situation very unpredictable and strange.
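For reference, here is a minimal inline sketch of the problematic pattern (a stripped-down version of what the notebook shows; the class name is made up for illustration, and the exact tracked count may vary by Keras/TF version):

```python
import tensorflow as tf

class AppendLayer(tf.keras.layers.Layer):
    """Appends raw tf.Variable objects to a list that starts empty."""

    def __init__(self, n=2, **kwargs):
        super().__init__(**kwargs)
        self.vs = []  # empty list assigned first
        for i in range(n):
            # Appended after the assignment, so attribute tracking
            # may never see these variables.
            self.vs.append(tf.Variable(1.0, name=f"v_{i}"))

    def call(self, x):
        return x * self.vs[0]

layer = AppendLayer()
created = len(layer.vs)
tracked = len(layer.trainable_weights)
# On affected versions, tracked is 0 even though created is 2.
print(f"created={created}, tracked={tracked}")
```

If `tracked` comes out lower than `created`, the missing variables will silently receive no gradient updates during training.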

Note: If instead of tf.Variable you use self.add_weight, tracking works in every case.
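To illustrate the workaround, here is a hedged sketch of the add_weight pattern (the class and weight names are illustrative, not from the notebook):

```python
import tensorflow as tf

class AddWeightLayer(tf.keras.layers.Layer):
    """Creates weights via self.add_weight, which registers them directly
    with the layer instead of relying on attribute assignment tracking."""

    def __init__(self, n=3, **kwargs):
        super().__init__(**kwargs)
        self.scales = []
        for i in range(n):
            # add_weight registers the variable even though the list
            # started empty and was appended to afterwards.
            self.scales.append(
                self.add_weight(name=f"scale_{i}", shape=(),
                                initializer="ones", trainable=True)
            )

    def call(self, x):
        return x * self.scales[0]

layer = AddWeightLayer()
print(len(layer.trainable_weights))  # expected: 3
```

Because add_weight appends to the layer's internal weight list itself, it does not depend on the `__setattr__`-based tracking that the append pattern bypasses.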

JaimeArboleda avatar Jul 18 '22 05:07 JaimeArboleda

@gadagashwini I was able to replicate the issue on Colab; please find the gist here. Thank you!

sushreebarsa avatar Jul 19 '22 02:07 sushreebarsa

It looks like the only issue is with defining variables in a list starting with an empty list and appending. Would you agree with this description of the issue, and would you be interested in contributing a fix?

mihirparadkar avatar Jul 28 '22 17:07 mihirparadkar

I agree with the description: as far as I have noticed, that is the only case where variables are lost.

I would like to contribute, but I'm afraid my knowledge of Keras internals is limited, so I don't think I would be able to contribute much. I've been taking a look at the code, and I think I've found a comment that might be relevant (in case it helps):

In line 3127 of this file: https://github.com/keras-team/keras/blob/master/keras/engine/base_layer.py

It says:

# TODO(b/125122625): This won't pick up on any variables added to a
# list/dict after creation.
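That comment matches the symptom. As a pure-Python sketch (not the actual Keras implementation, just an illustration of the mechanism it describes): tracking fires inside `__setattr__`, so it only sees a list's contents at assignment time, and mutating an already-assigned list never re-triggers it.

```python
class Tracker:
    """Toy stand-in for assignment-time attribute tracking (NOT Keras code)."""

    def __init__(self):
        object.__setattr__(self, "tracked", [])

    def __setattr__(self, name, value):
        # Tracking only fires here, at assignment time.
        if isinstance(value, list):
            self.tracked.extend(value)  # snapshot of current contents
        object.__setattr__(self, name, value)

t = Tracker()
t.vs = [1, 2]     # assignment: both elements are seen by the tracker
t.ws = []         # assignment of an empty list: nothing to snapshot
t.ws.append(3)    # plain list mutation: __setattr__ never runs
print(t.tracked)  # [1, 2] -- the appended 3 was never picked up
```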

But again, I don't have a big enough picture of the Keras internals to contribute a fix myself. I'm sorry for that.

JaimeArboleda avatar Jul 29 '22 04:07 JaimeArboleda

No worries, and thanks for letting us know. Given the priority we think this issue has, we'll keep it open as a community contribution item.

rchao avatar Jul 29 '22 18:07 rchao

I agree with the low priority. May I suggest that we at least document this behaviour? I believe it's fine as long as it is stated somewhere, so it doesn't create "unexpected surprises" that can be difficult to detect: if you don't check the trainable weights, the model will compile and train, but performance will be worse, depending on how many variables are left unoptimized.

JaimeArboleda avatar Aug 01 '22 10:08 JaimeArboleda

Agreed. Since you've just been looking into this, would you mind sending us a PR with the documentation updates?

rchao avatar Aug 01 '22 23:08 rchao

Ok, done:

https://github.com/keras-team/keras/pull/16858

It's the first PR of my life, so if I did something wrong please tell me :)

JaimeArboleda avatar Aug 02 '22 10:08 JaimeArboleda

Took a look at the PR and the repro. It is a bit weird that the layer and the model treat and track the weights differently. From an API perspective, a subclassed model is just a subclassed layer with additional training logic, so the weight-tracking behavior should be the same between Model and Layer. I think we should fix this on the Keras side.

qlzh727 avatar Aug 02 '22 16:08 qlzh727

Yes, that makes a lot of sense. Inspecting how the Model tracks weights in this situation should give a clue about how to fix the Layer class. I will take a look and see if I can do it.

JaimeArboleda avatar Aug 03 '22 07:08 JaimeArboleda