
Feature request: optimizer support for slot variables with a different dtype from the variable.

Open georgepaw opened this issue 2 years ago • 5 comments

Currently the optimizer assumes that slot variables have the same dtype as the variable they track, which prevents experimentation.

From digging around, a solution would be to:

  1. Extend the SlotVariableReference proto in tensorflow/core/protobuf/trackable_object_graph.proto to include the dtype, and set that field when saving/loading trackable objects in TF.
  2. Extend the add_slot function to take dtype=None as a kwarg and set the dtype of the created variable to dtype or var.dtype, preserving backwards compatibility (a minimal sketch follows this list).
  3. Set the dtype arg when calling add_slot in tensorflow/python/saved_model/load.py.
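
A minimal sketch of the behaviour proposed in step 2. This is a standalone illustration, not the actual OptimizerV2.add_slot patch; create_slot is a hypothetical helper name, and the real change would keep the optimizer's slot-dict and checkpoint bookkeeping intact:

```python
import tensorflow as tf

def create_slot(var, slot_name, initializer="zeros", shape=None, dtype=None):
    # Proposed behaviour: an explicit dtype wins, otherwise fall back to
    # var.dtype so existing callers are unaffected.
    slot_dtype = dtype or var.dtype
    slot_shape = var.shape if shape is None else shape
    init_fn = tf.keras.initializers.get(initializer)
    return tf.Variable(
        initial_value=lambda: init_fn(slot_shape, dtype=slot_dtype),
        dtype=slot_dtype,
        trainable=False,
        name=f"{var.name.split(':')[0]}/{slot_name}",
    )

# Usage: an fp16 weight with an fp32 momentum slot.
w = tf.Variable(tf.zeros([4], dtype=tf.float16), name="kernel")
m = create_slot(w, "m", dtype=tf.float32)
assert m.dtype == tf.float32
```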

More than happy to implement this, especially if you guys are happy with the proposed solution here.

georgepaw avatar May 05 '22 08:05 georgepaw

@georgepaw, can you please share a use case that supports your request so that the issue can be more easily understood? Thanks!

tilakrayal avatar May 06 '22 02:05 tilakrayal

One could write an optimizer (for example Adam) for a model whose weights and gradients are in fp16, but the slot variables might have to be kept in higher precision, for example fp32.
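
For concreteness, a hand-rolled sketch of a single Adam-style step under that assumption (bias correction omitted for brevity): the parameter and gradient are fp16, while the moment accumulators stay in fp32.

```python
import tensorflow as tf

param = tf.Variable(tf.random.normal([1024], dtype=tf.float16))
grad = tf.random.normal([1024], dtype=tf.float16)

# Desired slot variables: same shape as `param`, but higher precision.
m = tf.Variable(tf.zeros_like(param, dtype=tf.float32), trainable=False)
v = tf.Variable(tf.zeros_like(param, dtype=tf.float32), trainable=False)

beta_1, beta_2, eps, lr = 0.9, 0.999, 1e-7, 1e-3
g32 = tf.cast(grad, tf.float32)                 # accumulate in fp32
m.assign(beta_1 * m + (1.0 - beta_1) * g32)
v.assign(beta_2 * v + (1.0 - beta_2) * tf.square(g32))
update = lr * m / (tf.sqrt(v) + eps)
param.assign_sub(tf.cast(update, tf.float16))   # apply in the param's dtype
```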

georgepaw avatar May 06 '22 06:05 georgepaw

Adding Chen for more input.

qlzh727 avatar May 11 '22 22:05 qlzh727

Any thoughts on the proposed solution?

georgepaw avatar May 20 '22 13:05 georgepaw

Sorry for missing this issue (somehow it slipped away).

I like the idea of a flexible dtype. The experimental Keras optimizer actually has a method, add_variable, that may suit your needs. Could you check if that works?

Thanks, and sorry again!

chenmoneygithub avatar Aug 09 '22 03:08 chenmoneygithub
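
For reference, a hedged sketch (written against the TF 2.9-era tf.keras.optimizers.experimental.Optimizer API; exact method signatures may differ between versions) of how add_variable could be used to keep momentum in fp32 for fp16 model variables. MomentumFP32 is a hypothetical optimizer name; build and update_step follow the pattern used by the experimental built-in optimizers.

```python
import tensorflow as tf

class MomentumFP32(tf.keras.optimizers.experimental.Optimizer):
    """SGD with momentum where the momentum slots are always fp32."""

    def __init__(self, learning_rate=0.01, momentum=0.9, name="MomentumFP32"):
        super().__init__(name=name)
        self._learning_rate = self._build_learning_rate(learning_rate)
        self.momentum = momentum

    def build(self, var_list):
        super().build(var_list)
        if hasattr(self, "_built") and self._built:
            return
        self.momentums = [
            # add_variable lets us pick the slot dtype independently of the
            # model variable's dtype.
            self.add_variable(shape=var.shape, dtype=tf.float32, name="m")
            for var in var_list
        ]
        self._built = True

    def update_step(self, gradient, variable):
        lr = tf.cast(self.learning_rate, tf.float32)
        m = self.momentums[self._index_dict[self._var_key(variable)]]
        m.assign(self.momentum * m + tf.cast(gradient, tf.float32))
        variable.assign_sub(tf.cast(lr * m, variable.dtype))
```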