Support for Keras v3 file format

Open vloncar opened this issue 1 year ago • 1 comments

Prerequisites

Keras introduced a new file format, called "Keras v3", in a recent version of TF (2.12 or 2.13, I don't remember which). It is now preferred over the h5 and SavedModel formats. This is part of a wider set of changes coming in Keras v3, which is due out later this year and will become the version TF uses internally.

Details

The new format uses the extension .keras and seems to be a zip archive with three components: the model JSON, the model metadata (versions of the tools) and the weights in the familiar h5 format.
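As a rough illustration of that layout, the sketch below opens a .keras-style archive with the standard library and parses the two JSON members. The member names `config.json` and `metadata.json`, the helper name `read_keras_archive` and the stand-in archive contents are assumptions made for the example; only `model.weights.h5` is named in this thread. A real file saved by Keras should be checked before relying on them.

```python
import io
import json
import zipfile


def read_keras_archive(source):
    """Return (member names, parsed config, parsed metadata) of a .keras archive.

    `source` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(source, "r") as archive:
        names = archive.namelist()
        # Member names below are assumptions about the archive layout.
        config = json.loads(archive.read("config.json"))
        metadata = json.loads(archive.read("metadata.json"))
    return names, config, metadata


# Build a tiny stand-in archive in memory so the sketch runs without Keras.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("config.json", json.dumps({"class_name": "Sequential"}))
    archive.writestr("metadata.json", json.dumps({"keras_version": "2.13.1"}))
    archive.writestr("model.weights.h5", b"")  # placeholder, not a real h5 file

buffer.seek(0)
names, config, metadata = read_keras_archive(buffer)
print(names)
print(config["class_name"], metadata["keras_version"])
```

With a real model, passing the path of the saved file (e.g. 'model.keras') instead of the in-memory buffer should behave the same way, provided the member names match.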

New behavior

With this change, hls4ml will be compatible with the latest version of TF and the ecosystem will widen quite a bit.

Motivation

Keras v3 will be a major milestone that reintroduces support for additional backends, namely PyTorch and JAX. More here.

Parts of hls4ml being affected

The Keras converter will have to be extended. This should have a low impact on the existing codebase, since we only have to extract the JSON config and the h5 weights from the .keras file and then proceed with parsing as before. The format of the JSON has changed slightly, but the attributes of the layer config (obtained with Layer.get_config()) remain the same or gain useful extra features (like shape info), meaning all handlers should work unmodified. We will only need to extract the class name differently (it is now called "module") and optionally use the new information from the layer config.
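A minimal sketch of how the per-layer entries might be walked before handing each layer config to the existing handlers. The key layout used here (a `layers` list whose entries carry `module`, `class_name` and `config`) is an assumption based on a Sequential model and should be verified against a real config.json; `iter_layer_entries` and `example_config` are hypothetical names for illustration, not part of hls4ml.

```python
def iter_layer_entries(model_config):
    """Yield (class_name, module, layer_config) for each layer entry.

    Assumes a Sequential-style config: model_config["config"]["layers"] is a list.
    """
    for entry in model_config["config"]["layers"]:
        # `module` is read with .get() so older-style entries without it still work.
        yield entry["class_name"], entry.get("module"), entry["config"]


# Hand-written stand-in for the JSON stored in a .keras file (assumed layout).
example_config = {
    "class_name": "Sequential",
    "config": {
        "layers": [
            {"module": "keras.layers", "class_name": "Dense",
             "config": {"name": "dense", "units": 5}},
            {"module": "keras.layers", "class_name": "LeakyReLU",
             "config": {"name": "leaky_re_lu"}},
        ]
    },
}

for class_name, module, layer_config in iter_layer_entries(example_config):
    print(module, class_name, layer_config["name"])
```

Keeping the extraction in one small function means the layer handlers themselves only ever see the `config` dictionary, which is the part described above as unchanged.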

vloncar avatar Jul 12 '23 02:07 vloncar

I had a quick go at this. There are a few non-trivial implementation details that need to be considered compared to the current .h5 file reader. Documentation is available at: https://keras.io/api/saving/model_saving_and_loading/

  • Layer names in the weights file model.weights.h5 and layer names from model.summary() or the JSON config are not necessarily the same. To see this, run the code below with the two lines commented out, then uncomment them and run it again. It seems that if a layer is called e.g. dense or conv, the keys in the .h5 weight file and in the JSON configuration match. However, when a layer is called dense_1, the key in the .h5 file is renamed. It is worth looking at how Keras handles this internally.
  • The current Keras readers look for tensors called kernel and bias in the .h5 files; with Keras v3 these names are omitted and replaced with 0, 1 and so on. To avoid rewriting all the layer parsers, it is worth abstracting this implementation detail somehow.
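    import h5py
    import zipfile
    from keras.models import Sequential
    from keras.layers import Dense, Conv2D, Flatten, ReLU, LeakyReLU
    
    # Example input shape (height, width, channels); an assumed value, any valid shape works
    input_shape = (28, 28, 1)
    
    keras_model = Sequential()
    keras_model.add(Conv2D(8, (3, 3), input_shape=input_shape))
    keras_model.add(ReLU())
    # keras_model.add(Conv2D(16, (3, 3)))  # uncomment for the second run
    keras_model.add(Flatten())
    # keras_model.add(Dense(24))  # uncomment for the second run
    keras_model.add(Dense(5))
    keras_model.add(LeakyReLU())
    keras_model.summary()
    keras_model.save('model.keras')
    
    # The .keras file is a zip archive; open the weights member directly with h5py
    saved_model = zipfile.ZipFile('model.keras', 'r')
    model_weights = h5py.File(saved_model.open('model.weights.h5'), 'r')
    # Compare these keys with the layer names printed by summary()
    print(list(model_weights['_layer_checkpoint_dependencies'].items()))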
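One possible way to abstract the kernel/bias naming, so that existing layer parsers can keep asking for tensors by name, is a small lookup that falls back to the positional keys. This is only a sketch: `WEIGHT_ORDER` and `get_weight` are hypothetical names, the assumed ordering (kernel first, then bias) would need to be checked per layer type, and plain dictionaries stand in for the h5py groups so the example runs without Keras or h5py.

```python
# Hypothetical table: order in which each layer class stores its tensors.
WEIGHT_ORDER = {
    "Dense": ("kernel", "bias"),
    "Conv2D": ("kernel", "bias"),
}


def get_weight(layer_vars, class_name, name):
    """Fetch a tensor by its old name from a group keyed by name or by '0', '1', ...

    `layer_vars` is any mapping (a dict here; an h5py group supports the same
    `in` and indexing operations).
    """
    # Old-style files store the tensor under its name directly.
    if name in layer_vars:
        return layer_vars[name]
    # New-style files: translate the name to its assumed positional key.
    index = WEIGHT_ORDER[class_name].index(name)
    return layer_vars[str(index)]


# Stand-ins for the per-layer variable groups of an old and a new weights file.
old_style = {"kernel": [[1.0, 2.0]], "bias": [0.5]}
new_style = {"0": [[1.0, 2.0]], "1": [0.5]}

print(get_weight(old_style, "Dense", "bias"))
print(get_weight(new_style, "Dense", "bias"))
```

A layer created without a bias would only have the key 0, so a real implementation would also need to consult the layer config (e.g. use_bias) before indexing.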

bo3z avatar Jul 19 '23 12:07 bo3z