
[BUG Report]: Problems with deserialize_keras_object / deserialize_model_config

Open FrancescoRusticali opened this issue 1 year ago • 2 comments

Description

Hi all, I'm trying to use JSON serialization and deserialization as a simple way of sharing models between applications, but I think I've run into a strange bug. As far as I understand, the activation function is ignored when deserializing the config object, and linear activation is always used. The code below is a minimal reproduction example.

Reproduction Steps

In the following code I construct two models with one convolutional layer each, the first with a relu activation function and the second with linear activation. Then I serialize the first one (with relu) and deserialize it again to get a third model that should be identical to the relu one. Instead, its behaviour matches the linear one. It looks like the activation function specified in the model is lost while serializing and/or deserializing.

    var layers = new LayersApi();
    Tensorflow.Keras.Models.ModelsApi models = new Tensorflow.Keras.Models.ModelsApi();

    var inputs = keras.Input(shape: new Shape(4, 4, 1));
            
    //Create model with convolutional layer with activation = relu
    var sequential_relu = keras.Sequential(name: "Sequential_relu");
    sequential_relu.add(inputs);
    sequential_relu.add(layers.Conv2D(1, new Shape(3, 3), activation: "relu"));

    //Create model with convolutional layer with activation = linear
    var sequential_linear = keras.Sequential(name: "Sequential_linear");
    sequential_linear.add(inputs);
    sequential_linear.add(layers.Conv2D(1, new Shape(3, 3), activation: "linear"));

    //Serialize model with activation = relu
    JObject mySerializedModel = Tensorflow.Keras.Utils.generic_utils.serialize_keras_object(sequential_relu);

    //Deserialize back into a third model
    JToken mySerializedConfig = mySerializedModel.GetValue("config");
    Tensorflow.Keras.Saving.ModelConfig myConfig = generic_utils.deserialize_model_config(mySerializedConfig);
    var sequential_json = keras.Sequential(name: "Sequential_json");
    if (mySerializedModel.GetValue("class_name").ToObject<string>() == "Sequential")
    {
        for (int i = 0; i < myConfig.Layers.Count; i++)
        {
            Layer layer = generic_utils.deserialize_keras_object(myConfig.Layers[i].ClassName, myConfig.Layers[i].Config);
            sequential_json.add(layer);
        }
    }

    //Use same weights for all models (to be able to compare outputs)
    for (int i = 0; i < myConfig.Layers.Count; i++)
    {
        sequential_linear.Weights[i].assign(sequential_relu.Weights[i]);
        sequential_linear.Weights[i].assign(sequential_json.Weights[i]);
    }

    //Apply all models to arbitrary input and print results
    float[] inputArray = new float[16];

    for (int i = 0; i < inputArray.Length; i++)
        inputArray[i] = 0.5f;

    NDArray input = tf.convert_to_tensor(inputArray).numpy();
    input = input.reshape(new Shape(1, 4, 4, 1));

    var output = sequential_relu.Apply(input);
    System.Diagnostics.Debug.WriteLine("\nSequential_relu output: " + output[0].numpy().ToString());
    output = sequential_linear.Apply(input);
    System.Diagnostics.Debug.WriteLine("\nSequential_linear output: " + output[0].numpy().ToString());
    output = sequential_json.Apply(input);
    System.Diagnostics.Debug.WriteLine("\nSequential_json output: " + output[0].numpy().ToString());

These are the outputs I get:

Sequential_relu output: array([[[[0,2201839], 
[0,2201839]], 
[[0,2201839], 
[0,2201839]]]])

Sequential_linear output: array([[[[0,1164095], 
[0,1164095]], 
[[0,1164095], 
[0,1164095]]]])

Sequential_json output: array([[[[0,1164095], 
[0,1164095]], 
[[0,1164095], 
[0,1164095]]]])

There is probably a random seed applied somewhere, so the results can vary between runs, but for Sequential_json I always get the same results as Sequential_linear (instead of Sequential_relu, as I would expect). Am I right? Or am I missing something?

Known Workarounds

No response

Configuration and Other Information

Tensorflow.NET v0.100.4.

FrancescoRusticali, Apr 26 '23 15:04

Hi, thank you for reporting this. I ran your code on the master branch and got exactly the same results from all three models. The different results you got could have one of the following causes:

  1. The problem exists in v0.100.4 but has since been fixed by a PR on the master branch. (Sorry that I can't tell which PR it is; tf.net has been under rapid development recently.) If this is the case, please use the master branch as a workaround.
  2. I noticed that your code may contain a small mistake. When assigning the weights, I wonder whether sequential_linear.Weights[i].assign(sequential_json.Weights[i]); should be sequential_json.Weights[i].assign(sequential_relu.Weights[i]); so that all three models share the same weights. If this is the case, v0.100.4 could still work for you.
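For reference, the corrected weight-assignment loop from point 2 might look like the sketch below. It is a fragment, not a standalone program: it reuses the sequential_relu, sequential_linear and sequential_json models built in the reproduction code. Looping over sequential_relu.Weights.Count instead of myConfig.Layers.Count is an assumption (it presumes Weights is a list-like collection with a Count property), and it has not been verified against v0.100.4.

```csharp
// Sketch only: depends on the three models from the reproduction code above.
// Assumption: Weights exposes Count, so every variable (kernel and bias) is
// visited, rather than one entry per configured layer.
for (int i = 0; i < sequential_relu.Weights.Count; i++)
{
    // Copy the relu model's weights into both other models so that any
    // remaining difference in outputs can only come from the activation.
    sequential_linear.Weights[i].assign(sequential_relu.Weights[i]);
    sequential_json.Weights[i].assign(sequential_relu.Weights[i]);
}
```

With this loop all three models use sequential_relu as the single source of weights, whereas the original loop overwrote sequential_linear twice and never touched sequential_json.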

AsakusaRinne, Apr 26 '23 19:04

Hi, thank you for the suggestions. You're right about point 2, that's a typo. Still, I'm pretty sure the behaviour in v0.100.4 is the one I described above.

However, I tried the latest master code and I get more consistent results, so apparently the problem has already been fixed on the master branch.

FrancescoRusticali, Apr 28 '23 13:04