Saving and then loading some models with the bfloat16 datatype leads to a crash

Open maybeLee opened this issue 3 years ago • 1 comments

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.9.0, nightly
  • Python version: 3.7
  • Bazel version (if compiling from source): N/A
  • GPU model and memory: N/A
  • Exact command to reproduce: https://colab.research.google.com/drive/1Iu5LTimgYKmYnAq58eZT8wJ0d7lPnXVW?usp=sharing

Describe the problem. I want to build a model that contains a Conv2D layer with the bfloat16 datatype. Building the model and saving it in h5 format succeeds, but loading the h5 file again raises an error. The problem occurs with several layers (Conv2D, Conv1D, SeparableConv2D), while other layers (ReLU, BatchNormalization) can be saved and loaded again with the bfloat16 datatype.

Describe the expected behavior. A model that contains a bfloat16 Conv2D layer and was saved in h5 format should load again without an error.

Contributing.

  • Do you want to contribute a PR? (yes/no): yes
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing):

Standalone code to reproduce the issue.

import keras


x = keras.layers.Input((224,224,3), dtype="bfloat16")
buggy_layer = keras.layers.Conv2D(3,3,dtype="bfloat16")
y = buggy_layer(x)
model = keras.models.Model(x,y)
model.summary()
model.save("temp.h5")

model_name = "temp.h5"
model = keras.models.load_model(model_name)

I tested multiple layers; the results are available here: https://colab.research.google.com/drive/1Iu5LTimgYKmYnAq58eZT8wJ0d7lPnXVW?usp=sharing
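The multi-layer check can be sketched roughly as follows. This is a minimal sketch rather than the notebook's actual code: the helper `h5_roundtrip_ok` and the `candidates` dictionary are hypothetical names introduced here, and the outcome for each layer may depend on the installed Keras/TensorFlow version.

```python
import os
import tempfile

import keras


def h5_roundtrip_ok(make_layer, input_shape):
    """Build a one-layer bfloat16 model, save it to h5, and try to load it back.

    Returns True if load_model succeeds and False if it raises.
    """
    x = keras.layers.Input(input_shape, dtype="bfloat16")
    y = make_layer()(x)
    model = keras.models.Model(x, y)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "temp.h5")
        model.save(path)
        try:
            keras.models.load_model(path)
        except Exception:  # broad on purpose: any failure counts as "cannot reload"
            return False
    return True


# Hypothetical selection of layers, taken from the ones named in this report.
candidates = {
    "Conv2D": (lambda: keras.layers.Conv2D(3, 3, dtype="bfloat16"), (224, 224, 3)),
    "Conv1D": (lambda: keras.layers.Conv1D(3, 3, dtype="bfloat16"), (224, 3)),
    "SeparableConv2D": (
        lambda: keras.layers.SeparableConv2D(3, 3, dtype="bfloat16"),
        (224, 224, 3),
    ),
    "ReLU": (lambda: keras.layers.ReLU(dtype="bfloat16"), (224, 224, 3)),
    "BatchNormalization": (
        lambda: keras.layers.BatchNormalization(dtype="bfloat16"),
        (224, 224, 3),
    ),
}

results = {
    name: h5_roundtrip_ok(make_layer, shape)
    for name, (make_layer, shape) in candidates.items()
}
for name, ok in results.items():
    print(f"{name}: {'loads' if ok else 'fails to load'}")
```

Each layer is wrapped in its own model so that one failing layer does not hide the result for the others.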

Source code / logs.

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 conv2d (Conv2D)             (None, 222, 222, 3)       84        
                                                                 
=================================================================
Total params: 84
Trainable params: 84
Non-trainable params: 0
_________________________________________________________________
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-23b06202853b> in <module>
     10 
     11 model_name = "temp.h5"
---> 12 model = keras.models.load_model(model_name)

1 frames
/usr/local/lib/python3.7/dist-packages/keras/backend.py in batch_set_value(tuples)
   4301     if tf.executing_eagerly() or tf.inside_function():
   4302         for x, value in tuples:
-> 4303             x.assign(np.asarray(value, dtype=dtype_numpy(x)))
   4304     else:
   4305         with get_graph().as_default():

ValueError: No cast function available.
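The failing line converts each stored weight with `np.asarray(value, dtype=...)`, so one way to narrow the problem down is to look at the dtype each weight has inside the saved file. The following is a minimal sketch, assuming `h5py` is installed; the helper name `list_dataset_dtypes` is hypothetical and not part of Keras.

```python
import h5py


def list_dataset_dtypes(path):
    """Return {dataset_name: dtype_string} for every dataset in an HDF5 file."""
    found = {}

    def visit(name, obj):
        # visititems calls this for groups and datasets; only datasets hold arrays.
        if isinstance(obj, h5py.Dataset):
            found[name] = str(obj.dtype)

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found
```

Calling `list_dataset_dtypes("temp.h5")` on the file written by the reproduction script lists the stored dtype of the Conv2D kernel and bias. If they are not stored as a standard floating-point type, that would be consistent with the cast error above, although this report does not confirm it.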

maybeLee, Sep 23 '22 07:09

@gowthamkpr, I was able to reproduce the issue on TensorFlow v2.8, v2.9 and nightly. Kindly find the gist of it here.

tilakrayal, Sep 24 '22 10:09