Gradient checkpointing: Wrap `tf.keras.Model` or `tf.keras.layers.Layer` in `tf.recompute_grad()`

jarednielsen opened this issue 6 years ago • 4 comments

I'm implementing gradient checkpointing in my TensorFlow 2.1 project, following the docs here: https://www.tensorflow.org/api_docs/python/tf/recompute_grad

When I try `model = tf.recompute_grad(model)` it fails, and likewise with `model.layer = tf.recompute_grad(model.layer)`.

I can see that Keras behaves differently here because `tf.recompute_grad()` wraps the function in an `inner()` call, but what should I do to get this working?
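For reference, here is a minimal sketch of the basic pattern on a plain function rather than a Keras object. It assumes a recent TensorFlow 2.x release running eagerly; `block`, `w`, and `ckpt_block` are hypothetical names used only for illustration, and the sketch does not claim to resolve the `Model`/`Layer` case asked about above.

```python
import tensorflow as tf

# Hypothetical weight, created outside the checkpointed function.
w = tf.Variable(tf.ones((4, 4)))

def block(x):
    # Forward computation whose intermediate activations should not be kept.
    return tf.nn.relu(tf.matmul(x, w))

# tf.recompute_grad returns a new function that reruns `block`
# during the backward pass instead of storing its intermediates.
ckpt_block = tf.recompute_grad(block)

x = tf.ones((2, 4))
with tf.GradientTape() as tape:
    y = tf.reduce_sum(ckpt_block(x))

# `w` is still reachable here because we kept our own reference to it.
grads = tape.gradient(y, [w])
```

Note that the gradient is requested with respect to `w` directly, through a reference held outside the wrapped function.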

jarednielsen • Feb 21 '20 19:02

+1 to this question. I'm doing something similar with tf-nightly==2.2.0.dev20200119, using custom layers that inherit from Keras layers. The docs could use some clarification on what "keep a reference to the underlying object around for the purpose of accessing these variables" means in practice.
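One possible reading of that sentence, sketched in plain Python with no TensorFlow involved: a decorator that returns an inner function hands back an ordinary function, so attributes of the original object are not visible on the result. `wrap_like_decorator` and `ToyLayer` are hypothetical stand-ins, not TensorFlow or Keras APIs, and this only illustrates the general mechanism.

```python
def wrap_like_decorator(f):
    """Hypothetical stand-in for a decorator that returns a plain inner function."""
    def inner(*args, **kwargs):
        return f(*args, **kwargs)
    return inner

class ToyLayer:
    """Hypothetical callable object with state, standing in for a layer."""
    def __init__(self, scale):
        self.variables = [scale]

    def __call__(self, x):
        return self.variables[0] * x

layer = ToyLayer(3.0)
wrapped = wrap_like_decorator(layer)

# The wrapper still computes the same result as the original object...
result = wrapped(2.0)

# ...but it is a plain function, so the object's attributes are not on it.
has_variables = hasattr(wrapped, "variables")

# Keeping the original reference is what makes the state reachable.
variables = layer.variables
```

Under this reading, overwriting `layer` with the wrapped function would discard the only handle on `variables`, which is why the original object needs to be kept around.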

mathemakitten • Mar 02 '20 22:03

+1 to this. It would be great to know how to use Keras layers together with the gradient checkpointing functionality.

nickfrosst • Mar 03 '20 21:03

Hi,

Thank you for opening this issue. Since it has been open for a long time, the code and debug information in it may no longer be relevant to the current state of the code base.

The TensorFlow team is constantly improving the framework by fixing bugs and adding new features. We suggest you try the latest TensorFlow version with the latest compatible hardware configuration, which could potentially resolve the issue. If you are still facing the issue, please create a new GitHub issue with your latest findings and all the debugging information that could help us investigate.

Please follow the release notes to stay up to date with the latest TensorFlow developments.

Venkat6871 • Jul 26 '24 08:07

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] • Aug 03 '24 01:08

This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.

github-actions[bot] • Aug 10 '24 01:08