TensorFlow.NET
[BUG Report]: Problems with deserialize_keras_object / deserialize_model_config
Description
Hi all, I'm trying to use JSON serialization and deserialization as a simple way of sharing models between applications, but I think I've run into a strange bug. As far as I understand, the activation function is ignored when deserializing the config object, and linear activation is always used. The code below is a minimal reproduction example.
Reproduction Steps
In the following code I construct two models with one convolutional layer each, the first with relu activation and the second with linear activation. Then I serialize the first one (relu) and deserialize it again to get a third model that should be the same as the relu one. Instead, its behaviour matches the linear one. It looks like the activation function specified in the model is lost while serializing and/or deserializing.
var layers = new LayersApi();
Tensorflow.Keras.Models.ModelsApi models = new Tensorflow.Keras.Models.ModelsApi();
var inputs = keras.Input(shape: new Shape(4, 4, 1));
//Create model with convolutional layer with activation = relu
var sequential_relu = keras.Sequential(name: "Sequential_relu");
sequential_relu.add(inputs);
sequential_relu.add(layers.Conv2D(1, new Shape(3, 3), activation: "relu"));
//Create model with convolutional layer with activation = linear
var sequential_linear = keras.Sequential(name: "Sequential_linear");
sequential_linear.add(inputs);
sequential_linear.add(layers.Conv2D(1, new Shape(3, 3), activation: "linear"));
//Serialize model with activation = relu
JObject mySerializedModel = Tensorflow.Keras.Utils.generic_utils.serialize_keras_object(sequential_relu);
//Deserialize back into a third model
JToken mySerializedConfig = mySerializedModel.GetValue("config");
Tensorflow.Keras.Saving.ModelConfig myConfig = generic_utils.deserialize_model_config(mySerializedConfig);
var sequential_json = keras.Sequential(name: "Sequential_json");
if (mySerializedModel.GetValue("class_name").ToObject<string>() == "Sequential")
{
    for (int i = 0; i < myConfig.Layers.Count; i++)
    {
        Layer layer = generic_utils.deserialize_keras_object(myConfig.Layers[i].ClassName, myConfig.Layers[i].Config);
        sequential_json.add(layer);
    }
}
//Use same weights for all models (to be able to compare outputs)
for (int i = 0; i < myConfig.Layers.Count; i++)
{
    sequential_linear.Weights[i].assign(sequential_relu.Weights[i]);
    sequential_linear.Weights[i].assign(sequential_json.Weights[i]);
}
//Apply all models to arbitrary input and print results
float[] inputArray = new float[16];
for (int i = 0; i < inputArray.Length; i++)
    inputArray[i] = 0.5f;
NDArray input = tf.convert_to_tensor(inputArray).numpy();
input = input.reshape(new Shape(1, 4, 4, 1));
var output = sequential_relu.Apply(input);
System.Diagnostics.Debug.WriteLine("\nSequential_relu output: " + output[0].numpy().ToString());
output = sequential_linear.Apply(input);
System.Diagnostics.Debug.WriteLine("\nSequential_linear output: " + output[0].numpy().ToString());
output = sequential_json.Apply(input);
System.Diagnostics.Debug.WriteLine("\nSequential_json output: " + output[0].numpy().ToString());
These are the outputs I get:
Sequential_relu output: array([[[[0,2201839],
[0,2201839]],
[[0,2201839],
[0,2201839]]]])
Sequential_linear output: array([[[[0,1164095],
[0,1164095]],
[[0,1164095],
[0,1164095]]]])
Sequential_json output: array([[[[0,1164095],
[0,1164095]],
[[0,1164095],
[0,1164095]]]])
There is probably a random seed applied somewhere, so the results can vary between runs, but for Sequential_json I always get the same results as Sequential_linear (instead of Sequential_relu, as I would expect). Am I right? Or am I missing something?
Known Workarounds
No response
Configuration and Other Information
TensorFlow.NET v0.100.4.
Hi, thank you for reporting this. I ran your code on the master branch and got exactly the same results from all three models. The reason you got different results could be one of the following:
- The problem exists in v0.100.4 but has been fixed by a PR on the master branch. (Forgive me for not being able to tell which PR it is; tf.net has been under rapid development recently.) If this is the case, please use the master branch as a workaround.
- I noticed that your code may contain a small mistake. When assigning the weights, I wonder if
sequential_linear.Weights[i].assign(sequential_json.Weights[i]);
should be
sequential_json.Weights[i].assign(sequential_relu.Weights[i]);
to keep the weights the same across models. If this is the case, v0.100.4 could still work for you.
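A side note on the comparison itself: relu and linear activation give identical results for any non-negative pre-activation value, so two models with the same weights can only be told apart if at least one pre-activation value is negative. The reported Sequential_linear values are all positive, so applying relu to them would leave them unchanged. The sketch below illustrates this in plain C#. It is only an illustration: the class and method names are hypothetical and not part of TensorFlow.NET, and the reported value `0,1164095` is assumed to be 0.1164095 printed with a comma as the decimal separator.

```csharp
using System;

// Hypothetical helper for illustration only; not part of TensorFlow.NET.
public static class ActivationCheck
{
    // relu(x) = max(0, x)
    public static float Relu(float x) => Math.Max(0f, x);

    // Linear activation is the identity function.
    public static float Linear(float x) => x;

    // Returns true only if relu and linear disagree on at least one value,
    // which requires at least one negative pre-activation.
    public static bool CanDistinguish(float[] preActivations)
    {
        foreach (var v in preActivations)
        {
            if (Relu(v) != Linear(v))
                return true;
        }
        return false;
    }

    public static void Main()
    {
        // Values assumed from the reported Sequential_linear output.
        float[] reported = { 0.1164095f, 0.1164095f, 0.1164095f, 0.1164095f };
        Console.WriteLine(CanDistinguish(reported));      // False

        // A negative value makes the two activations differ.
        float[] withNegative = { 0.1164095f, -0.1164095f };
        Console.WriteLine(CanDistinguish(withNegative));  // True
    }
}
```

To make the test sensitive to the activation function, one option is to pick weights or inputs that produce some negative pre-activation values, so that a relu model and a linear model with identical weights give visibly different outputs.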
Hi, thank you for the suggestions. You're right about the second point, that's a typo. Still, I'm pretty sure the behaviour in v0.100.4 is the one I described above.
I also tried the latest master code and got more consistent results, so apparently the problem has already been fixed on the master branch.