TensorFlow.NET
GradientTape.gradient returning null
I am porting the Keras Actor Critic Method to TensorFlow.NET, and when I attempt to calculate the gradients it returns null.
using (var tape = tf.GradientTape())
{
    // ....
    // relevant code
    float loss_value = actor_losses_sum + critic_losses_sum;
    Tensor loss_value_tensor = tf.convert_to_tensor(loss_value);
    var grads = tape.gradient(loss_value_tensor, model.trainable_variables);
}
I have posted the full source code to reproduce the issue in this [Gist](https://gist.github.com/alexhiggins732/320286f89e53c3bb3ae291f5979db1f3).
I have researched and found several examples saying tape.watch may need to be called, but I have tried "watching" several tensors with no success.
> I am porting the Keras Actor Critic Method to TensorFlow.NET, and when I attempt to calculate the gradients it returns null.

Did you find any solution?
> I am porting the Keras Actor Critic Method to TensorFlow.NET, and when I attempt to calculate the gradients it returns null.
>
> Did you find any solution?
No.
Hey, I have encountered the same issue and wanted to show how I solved it for anybody that might end up here.
I encountered it while trying to port the Deep Deterministic Policy Gradient example, which is closely related to what @alexhiggins732 has tried.
This is the Python code from the Keras example:
which I translated to this in TensorFlow.NET:
You need to take special care with the loss. Take a look at what happens to var y: its shape was not what critic_model expected, so critic_loss ultimately had the wrong shape, which resulted in a null gradient. In this particular case y was also not known to the tape once the gradient was computed, which caused another error when the code was actually executed. That was solved with tape.watch(y).
Basically, what you want to see is something like this, where the shapes of the tensors all align nicely:
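The silent shape blow-up described above comes from standard broadcasting rules. A minimal sketch of those rules (plain Python, no TensorFlow; the helper `broadcast_shape` is hypothetical, written here only to illustrate how a mismatched `y` can inflate the loss term):

```python
# Sketch of NumPy/TensorFlow-style broadcasting, showing how a shape
# mismatch can silently blow up a loss term: combining a (64, 1) tensor
# with a (64,) tensor yields (64, 64), not (64, 1).

def broadcast_shape(a, b):
    # Right-align the shapes, padding the shorter one with leading 1s.
    a, b = list(a), list(b)
    while len(a) < len(b):
        a.insert(0, 1)
    while len(b) < len(a):
        b.insert(0, 1)
    out = []
    for x, y in zip(a, b):
        # Dimensions are compatible if equal, or if either is 1.
        if x == y or x == 1 or y == 1:
            out.append(max(x, y))
        else:
            raise ValueError(f"incompatible dimensions: {x} vs {y}")
    return tuple(out)

print(broadcast_shape((64, 1), (64, 1)))  # (64, 1): shapes align
print(broadcast_shape((64, 1), (64,)))    # (64, 64): silent blow-up
```

This is why checking that `y`, `critic_value`, and the resulting `critic_loss` all have the expected shapes catches the problem early.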
There must be a bug here, the variable is not automatically captured. Is it possible to provide a minimal unit test that reproduces this behavior? It's easier for me to fix things that way.
Sure thing, here you go:
using Tensorflow;
using Tensorflow.Keras.Engine;
using static Tensorflow.Binding;
using static Tensorflow.KerasApi;
using Tensorflow.NumPy;
public Model get_actor(int num_states)
{
    var inputs = keras.layers.Input(shape: num_states);
    var outputs = keras.layers.Dense(1, activation: keras.activations.Tanh).Apply(inputs);
    Model model = keras.Model(inputs, outputs);
    return model;
}

public Model get_critic(int num_states, int num_actions)
{
    // State as input
    var state_input = keras.layers.Input(shape: num_states);
    // Action as input
    var action_input = keras.layers.Input(shape: num_actions);
    var concat = keras.layers.Concatenate(axis: 1).Apply(new Tensors(state_input, action_input));
    var outputs = keras.layers.Dense(1).Apply(concat);
    Model model = keras.Model(new Tensors(state_input, action_input), outputs);
    model.summary();
    return model;
}
[TestMethod]
public void GetGradient_Test()
{
    var numStates = 3;
    var numActions = 1;
    var batchSize = 64;
    var gamma = 0.99f;
    var target_actor_model = get_actor(numStates);
    var target_critic_model = get_critic(numStates, numActions);
    var critic_model = get_critic(numStates, numActions);
    Tensor state_batch = tf.convert_to_tensor(np.zeros((batchSize, numStates)), TF_DataType.TF_FLOAT);
    Tensor action_batch = tf.convert_to_tensor(np.zeros((batchSize, numActions)), TF_DataType.TF_FLOAT);
    Tensor reward_batch = tf.convert_to_tensor(np.zeros((batchSize, 1)), TF_DataType.TF_FLOAT);
    Tensor next_state_batch = tf.convert_to_tensor(np.zeros((batchSize, numStates)), TF_DataType.TF_FLOAT);
    using (var tape = tf.GradientTape())
    {
        var target_actions = target_actor_model.Apply(next_state_batch, training: true);
        var target_critic_value = target_critic_model.Apply(new Tensors(new Tensor[] { next_state_batch, target_actions }), training: true);
        // this works
        // var y = reward_batch + target_critic_value;
        // this only works with tape.watch
        var y = reward_batch + tf.multiply(gamma, target_critic_value);
        //tape.watch(y);
        var critic_value = critic_model.Apply(new Tensors(new Tensor[] { state_batch, action_batch }), training: true);
        var critic_loss = math_ops.reduce_mean(math_ops.square(y - critic_value));
        var critic_grad = tape.gradient(critic_loss, critic_model.TrainableVariables);
        Assert.IsNotNull(critic_grad);
        Assert.IsNotNull(critic_grad.First());
    }
}
@LedStarr Thank you for your help; we have submitted a fix for it. However, we noticed that when running your test case, it throws an exception when getting the gradient instead of returning null as described in this issue. Is it the same in your local environment?
@AsakusaRinne yes, sorry for the confusion, I should have pointed that out a bit more. I do get the Exception in my environment as well.
Here is a test case resulting in a null value within the gradient. I think this is similar to what @alexhiggins732 encountered.
[TestMethod]
public void GetGradient_Test()
{
    var numStates = 3;
    var numActions = 1;
    var critic_model = get_critic(numStates, numActions);
    float loss_value = 0.1f;
    using (var tape = tf.GradientTape())
    {
        var critic_loss = tf.convert_to_tensor(loss_value);
        var critic_grad = tape.gradient(critic_loss, critic_model.TrainableVariables);
        Assert.IsNotNull(critic_grad);
        Assert.IsNotNull(critic_grad.First());
    }
}
@LedStarr Thanks a lot for your test case. I wrote its Python version and ran it. However, I found that in Python it also gets [None, None] as the gradient. Since critic_loss is a constant instead of the result of critic_model, I guess returning null in this case is reasonable?
The python version code is here:
import tensorflow as tf

def get_critic(num_states, num_actions):
    state_input = tf.keras.layers.Input(num_states)
    action_input = tf.keras.layers.Input(num_actions)
    concat = tf.keras.layers.Concatenate(1)([state_input, action_input])
    outputs = tf.keras.layers.Dense(1)(concat)
    model = tf.keras.Model([state_input, action_input], outputs)
    model.summary()
    return model

if __name__ == '__main__':
    num_states = 3
    num_actions = 1
    loss_value = 0.1
    with tf.GradientTape() as tape:
        critic_model = get_critic(num_states, num_actions)
        critic_loss = tf.convert_to_tensor(loss_value)
        critic_grad = tape.gradient(critic_loss, critic_model.trainable_variables)
        print(critic_grad)
@AsakusaRinne Hmm, if that is the case then I also think this is reasonable. The critic_loss is a constant because I wanted to show that this happens when you call tf.convert_to_tensor(loss_value) within the with statement of the tape. It should not be a constant in a regular scenario, of course. I thought this problem only occurred when you create a tensor from a single float value within the tape scope.
Anyways, if the Python code returns null here as well, then I don't consider it a bug in TF.NET :)
@LedStarr Thank you very much for your help in this issue! :)
The original TensorFlow code in Python is:
loss_value = sum(actor_losses) + sum(critic_losses)
grads = tape.gradient(loss_value, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
The Keras Actor Critic does not return null.
What is the difference?
@alexhiggins732 it might be that the type or shape is different. Can you compare the loss_value variable in both python and c# code?
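One likely difference, consistent with the earlier finding that a constant loss yields a null gradient: in the Python code, `sum(actor_losses)` is computed with tensor operations the tape records, while in the C# port the losses were summed as plain floats and then wrapped with tf.convert_to_tensor, producing a fresh constant with no recorded path back to the model's variables. A toy tape (pure Python, not TensorFlow; all names here are hypothetical) sketches why such a constant has no gradient:

```python
# Minimal sketch (NOT TensorFlow): a toy reverse-mode tape that records
# operations on tracked values. It shows why a constant built from a
# plain float has no path back to the watched variables, so its
# "gradient" is None -- mirroring TF and TensorFlow.NET behaviour.

class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.parents = parents  # pairs of (parent, local_gradient)

def add(a, b):
    # A recorded op: d(a+b)/da = d(a+b)/db = 1
    return Value(a.data + b.data, ((a, 1.0), (b, 1.0)))

def gradient(loss, var):
    # Walk the recorded graph from loss back to var, chain-ruling as we go.
    # Returns None when no recorded path connects loss to var.
    if loss is var:
        return 1.0
    total = None
    for parent, local in loss.parents:
        g = gradient(parent, var)
        if g is not None:
            total = (total or 0.0) + local * g
    return total

w = Value(2.0)                       # a "trainable variable"
loss_connected = add(w, Value(3.0))  # built from recorded ops
loss_constant = Value(5.0)           # like tf.convert_to_tensor(plain_float)

print(gradient(loss_connected, w))   # 1.0: path to w exists
print(gradient(loss_constant, w))    # None: no recorded path to w
```

If that is the cause, keeping the loss as a Tensor produced by ops inside the tape (rather than summing floats and converting) should restore the gradient.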