
KeyError, when calling profiling.compare

Open JochiSt opened this issue 3 years ago • 0 comments

Prerequisites

Please make sure to check off these prerequisites before submitting a bug report.

  • [x] Test that the bug appears on the current version of the master branch. Make sure to include the commit hash of the commit you checked out.
  • [x] Check that the issue hasn't already been reported, by checking the currently open issues.
  • [x] If there are steps to reproduce the problem, make sure to write them down below.
  • [ ] If relevant, please include the hls4ml project files, which were created directly before and/or after the bug.

Quick summary

When I call `hls4ml.model.profiling.compare(model, hls_model, y_test)` with the `norm_diff` option on a very simple, fully connected (Dense-only) network, I get the following error:

Traceback (most recent call last):
  File "traceHLSmodel.py", line 75, in <module>
    compareHLSmodel(model)
  File "traceHLSmodel.py", line 60, in compareHLSmodel
    compare_fig = hls4ml.model.profiling.compare(model, hls_model,
  File "/home/fpga_ai/venvs/HLS4ML3.8/src/hls4ml/hls4ml/model/profiling.py", line 690, in compare
    f = _norm_diff(ymodel, ysim)
  File "/home/fpga_ai/venvs/HLS4ML3.8/src/hls4ml/hls4ml/model/profiling.py", line 601, in _norm_diff
    diff[key] = np.linalg.norm(ysim[key]-ymodel[key])
KeyError: 'layer_0_relu'

I get a similar error when using the `dist_diff` plot option.
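The failure mode can be reproduced without hls4ml at all: `_norm_diff` indexes the Keras output dictionary with every key from the HLS simulation dictionary, so any mismatch in the activation suffix raises a `KeyError`. A minimal sketch with plain NumPy (the dictionary contents and key names here are illustrative, not taken from the actual hls4ml trace):

```python
import numpy as np

# Hypothetical per-layer outputs, mimicking what the Keras trace (ymodel)
# and the HLS simulation (ysim) might return for the same network.
ymodel = {"layer_0": np.ones(8), "layer_0_linear": np.ones(8)}
ysim = {"layer_0": np.ones(8), "layer_0_relu": np.ones(8)}

diff = {}
try:
    for key in ysim.keys():
        # Fails when the two dicts use different activation suffixes
        diff[key] = np.linalg.norm(ysim[key] - ymodel[key])
except KeyError as e:
    print("KeyError:", e)  # KeyError: 'layer_0_relu'
```

The loop succeeds for the shared key `layer_0` and then fails on `layer_0_relu`, which exists only on the simulation side, mirroring the traceback above.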

Details

The network I used:

    from tensorflow import keras

    inputs = keras.Input(shape=(120,), name="waveform_input")

    layer_cnt = 0
    x = keras.layers.Dense(8,
                           activation="relu",
                           #kernel_regularizer=keras.regularizers.l1(0.00001),
                           name="layer_%d" % (layer_cnt))(inputs)
    layer_cnt += 1

    x = keras.layers.Dense(8,
                           activation="relu",
                           #kernel_regularizer=keras.regularizers.l1(0.00001),
                           name="layer_%d" % (layer_cnt))(x)
    layer_cnt += 1

    x = keras.layers.Dense(6,
                           activation="relu",
                           #kernel_regularizer=keras.regularizers.l1(0.00001),
                           name="layer_%d" % (layer_cnt))(x)
    layer_cnt += 1

    # final layer producing the regression output
    outputs = keras.layers.Dense(2, name="regression")(x)

    model = keras.Model(inputs=inputs, outputs=outputs)

I trained the network, saved it to disk, reloaded it, and passed it into the `compare` function to see where the differences between the HLS model and the Keras model are.

Steps to Reproduce

Add what needs to be done to reproduce the bug. Add commented code examples and make sure to include the original model files / code, and the commit hash you are working on.

  1. Clone the hls4ml repository.
  2. Check out the master branch at commit hash a4b0e0c34a84d252559cac4a9f2f98e699964674.
     System setup:
     [GCC 9.3.1 20200408 (Red Hat 9.3.1-2)]
     NumPy 1.22.3
     TensorFlow 2.8.0
     Keras 2.8.0
     QKeras 0.9.0
     HLS4ML 0.6.0.dev217+ga4b0e0c
  3. Run the conversion on the model file with the code above.

Expected behavior

I would expect a plot of the per-layer differences.

Actual behavior

Instead, a Python `KeyError` is thrown.

Optional

Possible fix

It seems that the Keras layer names for the activation parts do not always end with `_relu`. For my regression network, the last layer's name ends with `_linear` and the others end with `_function`. But I have no clue why my layer names differ from the ones hls4ml assumes.
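One possible workaround (a sketch of a defensive fix, not the actual hls4ml code) would be to compute the norm difference only over the keys present in both dictionaries, instead of indexing the Keras dict with every key from the simulation dict:

```python
import numpy as np

def norm_diff_safe(ymodel, ysim):
    """Norm of the per-layer difference, restricted to layers
    present in both traces (sketch of a possible fix)."""
    diff = {}
    for key in ysim.keys() & ymodel.keys():
        diff[key] = np.linalg.norm(ysim[key] - ymodel[key])
    return diff

# Illustrative dicts with mismatched activation suffixes
ymodel = {"layer_0": np.zeros(8), "layer_0_linear": np.zeros(8)}
ysim = {"layer_0": np.zeros(8), "layer_0_relu": np.zeros(8)}
print(norm_diff_safe(ymodel, ysim))  # only 'layer_0' is compared
```

This silently skips the mismatched activation layers rather than crashing; a proper fix would presumably also reconcile the suffix naming so those layers are compared too.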

JochiSt avatar Mar 21 '23 16:03 JochiSt