Feature request: optimizer support for slot variables in a different dtype to the variable.
Currently, slot variables are assumed to have the same dtype as the variable, which prevents experimentation (for example, keeping optimizer state in fp32 for fp16 weights).
From digging around, a solution is to:
- Extend the `SlotVariableReference` proto in `tensorflow/core/protobuf/trackable_object_graph.proto` to include the dtype, and set that field when saving/loading trackable objects in TF.
- Extend the `add_slot` function to take `dtype=None` as a kwarg, and set the dtype of the created variable to `dtype or var.dtype` to preserve backwards compatibility.
- Set the dtype arg when calling `add_slot` in `tensorflow/python/saved_model/load.py`.
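The backwards-compatible `dtype=None` pattern from the second step can be sketched with plain-Python stand-ins (the `Var` and `Optimizer` classes below are simplified mocks for illustration, not the real TF classes):

```python
class Var:
    """Minimal stand-in for a TF variable: just a name and a dtype."""
    def __init__(self, name, dtype):
        self.name = name
        self.dtype = dtype

class Optimizer:
    """Minimal stand-in showing the proposed add_slot signature."""
    def __init__(self):
        self._slots = {}

    def add_slot(self, var, slot_name, dtype=None):
        # Falling back to var.dtype keeps the existing behaviour when the
        # new kwarg is not passed, preserving backwards compatibility.
        slot = Var(f"{var.name}/{slot_name}", dtype or var.dtype)
        self._slots[(var.name, slot_name)] = slot
        return slot

w = Var("kernel", "float16")
opt = Optimizer()
m = opt.add_slot(w, "m", dtype="float32")  # fp32 slot for an fp16 variable
v = opt.add_slot(w, "v")                   # defaults to the variable's dtype
```

Callers that never pass `dtype` see no change in behaviour, which is the whole point of the `dtype or var.dtype` fallback.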
More than happy to implement this, especially if you guys are happy with the proposed solution here.
@georgepaw, can you please share a use case that supports your statement so that the issue can be easily understood? Thanks!
One could write an optimizer (for example Adam) for a model whose weights and gradients are in fp16, but whose slot variables may need to be kept in higher precision, for example fp32.
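The precision point can be demonstrated with the standard library alone (this illustrates fp16 rounding generally, it is not Keras code): a small moment update can be lost entirely when the accumulator itself is stored in fp16, which is why fp32 slots matter.

```python
import struct

def to_fp16(x):
    # Round-trip a Python float through IEEE half precision
    # using the struct 'e' format (available since Python 3.6).
    return struct.unpack('e', struct.pack('e', x))[0]

acc = to_fp16(1.0)   # an accumulator stored in fp16
update = 1e-4        # a small per-step update, e.g. an Adam moment term

# In fp16 the update vanishes: near 1.0 the spacing between
# representable values is ~2**-10, so 1.0 + 1e-4 rounds back to 1.0.
print(to_fp16(acc + update) == acc)  # True: update lost in fp16

# In full precision the accumulator does move.
print(acc + update > acc)            # True
```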
Adding Chen for more inputs.
Any thoughts on the proposed solution?
Sorry for missing this issue (somehow it slipped away).
I like the idea of a flexible dtype. Actually, the experimental Keras optimizer has a method `add_variable` that may suit your needs; could you check if that works?
Thanks, and sorry again!