hls4ml
Issue with predict and HLS Compilation
Hi,
I'm trying to implement this model using hls4ml. The model conversion seems to work, but when calling predict() on the converted model with the test data I get:
python3: firmware/nnet_utils/nnet_conv2d_stream.h:73: void nnet::conv_2d_buffer_cl(hls::stream<srcType>&,
hls::stream<dstType>&, typename CONFIG_T::weight_t*, typename CONFIG_T::bias_t*)
[with data_T = nnet::array<ap_fixed<16, 6>, 2>; res_T = nnet::array<ap_fixed<16, 6>, 32>; CONFIG_T = config2; typename CONFIG_T::weight_t = ap_fixed<16, 6>;
typename CONFIG_T::bias_t = ap_fixed<16, 6>]:
Assertion `CONFIG_T::pad_top == 0 && CONFIG_T::pad_bottom == 0 && CONFIG_T::pad_left == 0 && CONFIG_T::pad_right == 0' failed.
Aborted (core dumped)
When I open the generated HLS project it looks fine, and when using backend='VivadoAccelerator' the HLS compilation actually completes without any errors. However, it only takes a minute, resource usage is around 0%, and the summary shows ? for Latency and Interval. I assume maybe only the myproject_axi() wrapper was processed?
When changing the backend to backend='Vivado' and running the HLS compilation, it stops with the error:
ERROR: [HLS 200-474] Empty dataflow region in myproject (it may have been optimized away due to the absence of outputs)
ERROR: [HLS 200-70] Failed building synthesis data model.
command 'ap_source' returned error code
while executing
"source /home/lukas/Documents/HiWi/tmpp/myproject_prj/solution1/csynth.tcl"
invoked from within
"hls::main /home/lukas/Documents/HiWi/tmpp/myproject_prj/solution1/csynth.tcl"
("uplevel" body line 1)
invoked from within
"uplevel 1 hls::main {*}$args"
(procedure "hls_proc" line 5)
invoked from within
"hls_proc $argv"
Finished C synthesis.
So I assume there is something wrong with the code generated by hls4ml?
I'm using version 0.6.0 of hls4ml, but I get the same issue with the master branch and version 0.5.0.
I hope someone can help me out or point me in the right direction. Thank you in advance, best regards.
My code used for generating the HLS project:
from yaml import load_all
import tensorflow as tf
from tensorflow import keras
from tensorflow import Tensor
print("Tensorflow version is ", tf.__version__)
print('Keras version : ',keras.__version__)
import numpy as np
import os, sys
from tensorflow.keras.models import Model, load_model
import hls4ml
from qkeras.utils import _add_supported_quantized_objects
from sklearn.metrics import accuracy_score
co = {}
_add_supported_quantized_objects(co)
model = load_model('./fp_model/resnet_fp_model.h5', custom_objects=co)
with open('./Dataset/X_train.npy', 'rb') as f:
    X_train = np.load(f)
with open('./Dataset/X_test.npy', 'rb') as f:
    X_test = np.load(f)
with open('./Dataset/Y_test.npy', 'rb') as f:
    Y_test = np.load(f)
with open('./Dataset/Y_train.npy', 'rb') as f:
    Y_train = np.load(f)
hls4ml.model.optimizer.OutputRoundingSaturationMode.layers = ['Activation']
hls4ml.model.optimizer.OutputRoundingSaturationMode.rounding_mode = 'AP_RND'
hls4ml.model.optimizer.OutputRoundingSaturationMode.saturation_mode = 'AP_SAT'
hls_config = hls4ml.utils.config_from_keras_model(model, granularity='name')
hls_config['Model']['Precision'] = 'ap_fixed<16,6>'
hls_config['Model']['ReuseFactor'] = 1
for Layer in hls_config['LayerName'].keys():
    hls_config['LayerName'][Layer]['Strategy'] = 'Latency'
    hls_config['LayerName'][Layer]['ReuseFactor'] = 1
# 'Stable' gives the best numerical performance for high-accuracy models; the default Latency implementation is faster but numerically less stable
hls_config['LayerName']['softmax']['Strategy'] = 'Stable'
hls_model = hls4ml.converters.convert_from_keras_model(model,
                                                       hls_config=hls_config,
                                                       io_type='io_stream',
                                                       backend='Vivado',
                                                       output_dir='tmpp/',
                                                       part='xczu7ev-ffvc1156-2-e')
hls_model.compile()
os.environ['PATH'] = '/tools/Xilinx/Vivado/2019.1/bin:' + os.environ['PATH']
hls_model.build(csim=False, synth=False, vsynth=False)
Y_pred = hls_model.predict(np.ascontiguousarray(X_test))
y_pred = np.argmax(Y_pred, axis = 1)
y_actual = np.argmax(Y_test, axis = 1)
accuracy = accuracy_score(y_actual, y_pred)
print("Accuracy: ", accuracy)
The generated output:
Tensorflow version is 2.8.0
Keras version : 2.8.0
Interpreting Model
Topology:
Layer name: rf_input, layer type: Input
Layer name: conv2d, layer type: Conv2D
-> Activation (linear), layer name: conv2d
Layer name: batch_normalization, layer type: BatchNormalization
Layer name: activation, layer type: Activation
Layer name: conv2d_1, layer type: Conv2D
-> Activation (linear), layer name: conv2d_1
Layer name: batch_normalization_1, layer type: BatchNormalization
Layer name: activation_1, layer type: Activation
Layer name: conv2d_2, layer type: Conv2D
-> Activation (linear), layer name: conv2d_2
Layer name: batch_normalization_2, layer type: BatchNormalization
Layer name: activation_2, layer type: Activation
Layer name: conv2d_3, layer type: Conv2D
-> Activation (linear), layer name: conv2d_3
Layer name: batch_normalization_3, layer type: BatchNormalization
Layer name: add, layer type: Add
Layer name: activation_3, layer type: Activation
Layer name: conv2d_4, layer type: Conv2D
-> Activation (linear), layer name: conv2d_4
Layer name: batch_normalization_4, layer type: BatchNormalization
Layer name: activation_4, layer type: Activation
Layer name: conv2d_5, layer type: Conv2D
-> Activation (linear), layer name: conv2d_5
Layer name: batch_normalization_5, layer type: BatchNormalization
Layer name: add_1, layer type: Add
Layer name: activation_5, layer type: Activation
Layer name: max_pooling2d, layer type: MaxPooling2D
Layer name: conv2d_6, layer type: Conv2D
-> Activation (linear), layer name: conv2d_6
Layer name: batch_normalization_6, layer type: BatchNormalization
Layer name: activation_6, layer type: Activation
Layer name: conv2d_7, layer type: Conv2D
-> Activation (linear), layer name: conv2d_7
Layer name: batch_normalization_7, layer type: BatchNormalization
Layer name: activation_7, layer type: Activation
Layer name: conv2d_8, layer type: Conv2D
-> Activation (linear), layer name: conv2d_8
Layer name: batch_normalization_8, layer type: BatchNormalization
Layer name: add_2, layer type: Add
Layer name: activation_8, layer type: Activation
Layer name: conv2d_9, layer type: Conv2D
-> Activation (linear), layer name: conv2d_9
Layer name: batch_normalization_9, layer type: BatchNormalization
Layer name: activation_9, layer type: Activation
Layer name: conv2d_10, layer type: Conv2D
-> Activation (linear), layer name: conv2d_10
Layer name: batch_normalization_10, layer type: BatchNormalization
Layer name: add_3, layer type: Add
Layer name: activation_10, layer type: Activation
Layer name: max_pooling2d_1, layer type: MaxPooling2D
Layer name: conv2d_11, layer type: Conv2D
-> Activation (linear), layer name: conv2d_11
Layer name: batch_normalization_11, layer type: BatchNormalization
Layer name: activation_11, layer type: Activation
Layer name: conv2d_12, layer type: Conv2D
-> Activation (linear), layer name: conv2d_12
Layer name: batch_normalization_12, layer type: BatchNormalization
Layer name: activation_12, layer type: Activation
Layer name: conv2d_13, layer type: Conv2D
-> Activation (linear), layer name: conv2d_13
Layer name: batch_normalization_13, layer type: BatchNormalization
Layer name: add_4, layer type: Add
Layer name: activation_13, layer type: Activation
Layer name: conv2d_14, layer type: Conv2D
-> Activation (linear), layer name: conv2d_14
Layer name: batch_normalization_14, layer type: BatchNormalization
Layer name: activation_14, layer type: Activation
Layer name: conv2d_15, layer type: Conv2D
-> Activation (linear), layer name: conv2d_15
Layer name: batch_normalization_15, layer type: BatchNormalization
Layer name: add_5, layer type: Add
Layer name: activation_15, layer type: Activation
Layer name: max_pooling2d_2, layer type: MaxPooling2D
Layer name: conv2d_16, layer type: Conv2D
-> Activation (linear), layer name: conv2d_16
Layer name: batch_normalization_16, layer type: BatchNormalization
Layer name: activation_16, layer type: Activation
Layer name: conv2d_17, layer type: Conv2D
-> Activation (linear), layer name: conv2d_17
Layer name: batch_normalization_17, layer type: BatchNormalization
Layer name: activation_17, layer type: Activation
Layer name: conv2d_18, layer type: Conv2D
-> Activation (linear), layer name: conv2d_18
Layer name: batch_normalization_18, layer type: BatchNormalization
Layer name: add_6, layer type: Add
Layer name: activation_18, layer type: Activation
Layer name: conv2d_19, layer type: Conv2D
-> Activation (linear), layer name: conv2d_19
Layer name: batch_normalization_19, layer type: BatchNormalization
Layer name: activation_19, layer type: Activation
Layer name: conv2d_20, layer type: Conv2D
-> Activation (linear), layer name: conv2d_20
Layer name: batch_normalization_20, layer type: BatchNormalization
Layer name: add_7, layer type: Add
Layer name: activation_20, layer type: Activation
Layer name: max_pooling2d_3, layer type: MaxPooling2D
Layer name: conv2d_21, layer type: Conv2D
-> Activation (linear), layer name: conv2d_21
Layer name: batch_normalization_21, layer type: BatchNormalization
Layer name: activation_21, layer type: Activation
Layer name: global_average_pooling2d, layer type: GlobalAveragePooling2D
Layer name: dense, layer type: Dense
-> Activation (relu), layer name: dense
Layer name: dense_1, layer type: Dense
-> Activation (relu), layer name: dense_1
Layer name: dense_2, layer type: Dense
-> Activation (linear), layer name: dense_2
Layer name: softmax, layer type: Activation
Interpreting Model
Topology:
Layer name: rf_input, layer type: InputLayer, input shapes: [[None, 1024, 1, 2]], output shape: [None, 1024, 1, 2]
Layer name: conv2d, layer type: Conv2D, input shapes: [[None, 1024, 1, 2]], output shape: [None, 1024, 1, 32]
Layer name: batch_normalization, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: activation, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: conv2d_1, layer type: Conv2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: batch_normalization_1, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: activation_1, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: conv2d_2, layer type: Conv2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: batch_normalization_2, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: activation_2, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: conv2d_3, layer type: Conv2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: batch_normalization_3, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: add, layer type: Merge, input shapes: [[None, 1024, 1, 32], [None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: activation_3, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: conv2d_4, layer type: Conv2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: batch_normalization_4, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: activation_4, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: conv2d_5, layer type: Conv2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: batch_normalization_5, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: add_1, layer type: Merge, input shapes: [[None, 1024, 1, 32], [None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: activation_5, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32]
Layer name: max_pooling2d, layer type: MaxPooling2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: conv2d_6, layer type: Conv2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: batch_normalization_6, layer type: BatchNormalization, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: activation_6, layer type: Activation, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: conv2d_7, layer type: Conv2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: batch_normalization_7, layer type: BatchNormalization, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: activation_7, layer type: Activation, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: conv2d_8, layer type: Conv2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: batch_normalization_8, layer type: BatchNormalization, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: add_2, layer type: Merge, input shapes: [[None, 512, 1, 32], [None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: activation_8, layer type: Activation, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: conv2d_9, layer type: Conv2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: batch_normalization_9, layer type: BatchNormalization, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: activation_9, layer type: Activation, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: conv2d_10, layer type: Conv2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: batch_normalization_10, layer type: BatchNormalization, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: add_3, layer type: Merge, input shapes: [[None, 512, 1, 32], [None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: activation_10, layer type: Activation, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32]
Layer name: max_pooling2d_1, layer type: MaxPooling2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: conv2d_11, layer type: Conv2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: batch_normalization_11, layer type: BatchNormalization, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: activation_11, layer type: Activation, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: conv2d_12, layer type: Conv2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: batch_normalization_12, layer type: BatchNormalization, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: activation_12, layer type: Activation, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: conv2d_13, layer type: Conv2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: batch_normalization_13, layer type: BatchNormalization, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: add_4, layer type: Merge, input shapes: [[None, 256, 1, 32], [None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: activation_13, layer type: Activation, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: conv2d_14, layer type: Conv2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: batch_normalization_14, layer type: BatchNormalization, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: activation_14, layer type: Activation, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: conv2d_15, layer type: Conv2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: batch_normalization_15, layer type: BatchNormalization, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: add_5, layer type: Merge, input shapes: [[None, 256, 1, 32], [None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: activation_15, layer type: Activation, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32]
Layer name: max_pooling2d_2, layer type: MaxPooling2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: conv2d_16, layer type: Conv2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: batch_normalization_16, layer type: BatchNormalization, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: activation_16, layer type: Activation, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: conv2d_17, layer type: Conv2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: batch_normalization_17, layer type: BatchNormalization, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: activation_17, layer type: Activation, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: conv2d_18, layer type: Conv2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: batch_normalization_18, layer type: BatchNormalization, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: add_6, layer type: Merge, input shapes: [[None, 128, 1, 32], [None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: activation_18, layer type: Activation, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: conv2d_19, layer type: Conv2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: batch_normalization_19, layer type: BatchNormalization, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: activation_19, layer type: Activation, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: conv2d_20, layer type: Conv2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: batch_normalization_20, layer type: BatchNormalization, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: add_7, layer type: Merge, input shapes: [[None, 128, 1, 32], [None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: activation_20, layer type: Activation, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32]
Layer name: max_pooling2d_3, layer type: MaxPooling2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 64, 1, 32]
Layer name: conv2d_21, layer type: Conv2D, input shapes: [[None, 64, 1, 32]], output shape: [None, 64, 1, 32]
Layer name: batch_normalization_21, layer type: BatchNormalization, input shapes: [[None, 64, 1, 32]], output shape: [None, 64, 1, 32]
Layer name: activation_21, layer type: Activation, input shapes: [[None, 64, 1, 32]], output shape: [None, 64, 1, 32]
Layer name: reshape, layer type: Reshape, input shapes: [[None, 64, 1, 32]], output shape: [None, 8, 8, 32]
Layer name: global_average_pooling2d, layer type: GlobalAveragePooling2D, input shapes: [[None, 8, 8, 32]], output shape: [None, 32]
Layer name: dense, layer type: Dense, input shapes: [[None, 32]], output shape: [None, 256]
Layer name: dense_1, layer type: Dense, input shapes: [[None, 256]], output shape: [None, 128]
Layer name: dense_2, layer type: Dense, input shapes: [[None, 128]], output shape: [None, 22]
Layer name: softmax, layer type: Softmax, input shapes: [[None, 22]], output shape: [None, 22]
Creating HLS model
WARNING: Config parameter "Precision" overwrites an existing attribute in layer "conv2d_1" (PointwiseConv2D)
WARNING: Config parameter "ReuseFactor" overwrites an existing attribute in layer "conv2d_1" (PointwiseConv2D)
WARNING: Config parameter "Strategy" overwrites an existing attribute in layer "conv2d_1" (PointwiseConv2D)
WARNING: Config parameter "Precision" overwrites an existing attribute in layer "conv2d_6" (PointwiseConv2D)
WARNING: Config parameter "ReuseFactor" overwrites an existing attribute in layer "conv2d_6" (PointwiseConv2D)
WARNING: Config parameter "Strategy" overwrites an existing attribute in layer "conv2d_6" (PointwiseConv2D)
WARNING: Config parameter "Precision" overwrites an existing attribute in layer "conv2d_11" (PointwiseConv2D)
WARNING: Config parameter "ReuseFactor" overwrites an existing attribute in layer "conv2d_11" (PointwiseConv2D)
WARNING: Config parameter "Strategy" overwrites an existing attribute in layer "conv2d_11" (PointwiseConv2D)
WARNING: Config parameter "Precision" overwrites an existing attribute in layer "conv2d_16" (PointwiseConv2D)
WARNING: Config parameter "ReuseFactor" overwrites an existing attribute in layer "conv2d_16" (PointwiseConv2D)
WARNING: Config parameter "Strategy" overwrites an existing attribute in layer "conv2d_16" (PointwiseConv2D)
Writing HLS project
Done
****** Vivado(TM) HLS - High-Level Synthesis from C, C++ and SystemC v2019.1 (64-bit)
**** SW Build 2552052 on Fri May 24 14:47:09 MDT 2019
**** IP Build 2548770 on Fri May 24 18:01:18 MDT 2019
** Copyright 1986-2019 Xilinx, Inc. All Rights Reserved.
source /tools/Xilinx/Vivado/2019.1/scripts/vivado_hls/hls.tcl -notrace
INFO: [HLS 200-10] Running '/tools/Xilinx/Vivado/2019.1/bin/unwrapped/lnx64.o/vivado_hls'
INFO: [HLS 200-10] For user 'lukas' on host 'ubuntu' (Linux_x86_64 version 5.13.0-40-generic) on Mon Apr 25 11:03:05 CEST 2022
INFO: [HLS 200-10] On os Ubuntu 20.04.2 LTS
INFO: [HLS 200-10] In directory '/home/lukas/Documents/HiWi/tmppp'
Sourcing Tcl script 'build_prj.tcl'
INFO: [HLS 200-10] Creating and opening project '/home/lukas/Documents/HiWi/tmppp/myproject_prj'.
INFO: [HLS 200-10] Adding design file 'firmware/myproject.cpp' to the project
INFO: [HLS 200-10] Adding test bench file 'myproject_test.cpp' to the project
INFO: [HLS 200-10] Adding test bench file 'firmware/weights' to the project
INFO: [HLS 200-10] Adding test bench file 'tb_data' to the project
INFO: [HLS 200-10] Creating and opening solution '/home/lukas/Documents/HiWi/tmppp/myproject_prj/solution1'.
INFO: [XFORM 203-101] Allowed max sub elements number after partition is 4096.
INFO: [XFORM 203-1161] The maximum of name length is set into 60.
INFO: [HLS 200-10] Setting target device to 'xczu7ev-ffvc1156-2-e'
INFO: [SYN 201-201] Setting up clock 'default' with a period of 5ns.
INFO: [Common 17-206] Exiting vivado_hls at Mon Apr 25 11:03:05 2022...
Synthesis report not found.
python3: firmware/nnet_utils/nnet_conv2d_stream.h:73: void nnet::conv_2d_buffer_cl(hls::stream<srcType>&, hls::stream<dstType>&, typename CONFIG_T::weight_t*, typename CONFIG_T::bias_t*) [with data_T = nnet::array<ap_fixed<16, 6>, 2>; res_T = nnet::array<ap_fixed<16, 6>, 32>; CONFIG_T = config2; typename CONFIG_T::weight_t = ap_fixed<16, 6>; typename CONFIG_T::bias_t = ap_fixed<16, 6>]: Assertion `CONFIG_T::pad_top == 0 && CONFIG_T::pad_bottom == 0 && CONFIG_T::pad_left == 0 && CONFIG_T::pad_right == 0' failed.
Aborted (core dumped)
Hi @LordScarface, thanks for your post. Do you have padding in the conv2d layers of your ResNet model? Would you mind providing your model file as well? Thanks.
Hi and thank you for the reply!
Yes, the Conv2D layers have padding set to 'same'.
Here is the code used for generating the model:
from tensorflow import keras
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization, Activation, Add,
                                     MaxPooling2D, Reshape, GlobalAveragePooling2D,
                                     Dense, Dropout)
from tensorflow.keras.optimizers import Adam

def resnet_block(input_data, filters, conv_size):
    x = Conv2D(filters, 1, activation=None, padding='same')(input_data)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Conv2D(filters, conv_size, activation=None, padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Conv2D(filters, conv_size, activation=None, padding='same')(x)
    x = BatchNormalization()(x)
    x = Add()([x, input_data])
    x = Activation('relu')(x)
    y = Conv2D(filters, conv_size, activation=None, padding='same')(x)
    y = BatchNormalization()(y)
    y = Activation('relu')(y)
    y = Conv2D(filters, conv_size, activation=None, padding='same')(y)
    y = BatchNormalization()(y)
    y = Add()([y, x])
    y = Activation('relu')(y)
    z = MaxPooling2D(2, strides=(2, 1), padding='same')(y)
    return z

num_resnet_blocks = 4
num_filters = 32
kernel_size = (5, 1)

rf_input = Input(shape=input_shp, name='rf_input')

x = Conv2D(num_filters, kernel_size, activation=None, padding='same')(rf_input)
x = BatchNormalization()(x)
x = Activation('relu')(x)

for i in range(num_resnet_blocks):
    x = resnet_block(x, num_filters, kernel_size)

x = Conv2D(num_filters, kernel_size, activation=None, padding='same')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)

# use if number of resnet blocks = 6
#x = Reshape((4, 4, num_filters), input_shape=(16, 1, num_filters))(x)
# use if number of resnet blocks = 4
x = Reshape((8, 8, num_filters), input_shape=(32, 1, num_filters))(x)

x = GlobalAveragePooling2D()(x)
dense_1 = Dense(256, activation='relu')(x)
dropout_1 = Dropout(0.5)(dense_1)
dense_2 = Dense(128, activation='relu')(dropout_1)
dropout_2 = Dropout(0.5)(dense_2)
dense_3 = Dense(num_classes)(dropout_2)
softmax = Activation('softmax', name='softmax')(dense_3)

optimizer = Adam(learning_rate=0.00050)
model = keras.Model(rf_input, softmax)
model.compile(loss='categorical_crossentropy', metrics=["accuracy"])
The model summary:
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
rf_input (InputLayer) [(None, 1024, 1, 2) 0 []
]
conv2d (Conv2D) (None, 1024, 1, 32) 352 ['rf_input[0][0]']
batch_normalization (BatchNorm (None, 1024, 1, 32) 128 ['conv2d[0][0]']
alization)
activation (Activation) (None, 1024, 1, 32) 0 ['batch_normalization[0][0]']
conv2d_1 (Conv2D) (None, 1024, 1, 32) 1056 ['activation[0][0]']
batch_normalization_1 (BatchNo (None, 1024, 1, 32) 128 ['conv2d_1[0][0]']
rmalization)
activation_1 (Activation) (None, 1024, 1, 32) 0 ['batch_normalization_1[0][0]']
conv2d_2 (Conv2D) (None, 1024, 1, 32) 5152 ['activation_1[0][0]']
batch_normalization_2 (BatchNo (None, 1024, 1, 32) 128 ['conv2d_2[0][0]']
rmalization)
activation_2 (Activation) (None, 1024, 1, 32) 0 ['batch_normalization_2[0][0]']
conv2d_3 (Conv2D) (None, 1024, 1, 32) 5152 ['activation_2[0][0]']
batch_normalization_3 (BatchNo (None, 1024, 1, 32) 128 ['conv2d_3[0][0]']
rmalization)
add (Add) (None, 1024, 1, 32) 0 ['batch_normalization_3[0][0]',
'activation[0][0]']
activation_3 (Activation) (None, 1024, 1, 32) 0 ['add[0][0]']
conv2d_4 (Conv2D) (None, 1024, 1, 32) 5152 ['activation_3[0][0]']
batch_normalization_4 (BatchNo (None, 1024, 1, 32) 128 ['conv2d_4[0][0]']
rmalization)
activation_4 (Activation) (None, 1024, 1, 32) 0 ['batch_normalization_4[0][0]']
conv2d_5 (Conv2D) (None, 1024, 1, 32) 5152 ['activation_4[0][0]']
batch_normalization_5 (BatchNo (None, 1024, 1, 32) 128 ['conv2d_5[0][0]']
rmalization)
add_1 (Add) (None, 1024, 1, 32) 0 ['batch_normalization_5[0][0]',
'activation_3[0][0]']
activation_5 (Activation) (None, 1024, 1, 32) 0 ['add_1[0][0]']
max_pooling2d (MaxPooling2D) (None, 512, 1, 32) 0 ['activation_5[0][0]']
conv2d_6 (Conv2D) (None, 512, 1, 32) 1056 ['max_pooling2d[0][0]']
batch_normalization_6 (BatchNo (None, 512, 1, 32) 128 ['conv2d_6[0][0]']
rmalization)
activation_6 (Activation) (None, 512, 1, 32) 0 ['batch_normalization_6[0][0]']
conv2d_7 (Conv2D) (None, 512, 1, 32) 5152 ['activation_6[0][0]']
batch_normalization_7 (BatchNo (None, 512, 1, 32) 128 ['conv2d_7[0][0]']
rmalization)
activation_7 (Activation) (None, 512, 1, 32) 0 ['batch_normalization_7[0][0]']
conv2d_8 (Conv2D) (None, 512, 1, 32) 5152 ['activation_7[0][0]']
batch_normalization_8 (BatchNo (None, 512, 1, 32) 128 ['conv2d_8[0][0]']
rmalization)
add_2 (Add) (None, 512, 1, 32) 0 ['batch_normalization_8[0][0]',
'max_pooling2d[0][0]']
activation_8 (Activation) (None, 512, 1, 32) 0 ['add_2[0][0]']
conv2d_9 (Conv2D) (None, 512, 1, 32) 5152 ['activation_8[0][0]']
batch_normalization_9 (BatchNo (None, 512, 1, 32) 128 ['conv2d_9[0][0]']
rmalization)
activation_9 (Activation) (None, 512, 1, 32) 0 ['batch_normalization_9[0][0]']
conv2d_10 (Conv2D) (None, 512, 1, 32) 5152 ['activation_9[0][0]']
batch_normalization_10 (BatchN (None, 512, 1, 32) 128 ['conv2d_10[0][0]']
ormalization)
add_3 (Add) (None, 512, 1, 32) 0 ['batch_normalization_10[0][0]',
'activation_8[0][0]']
activation_10 (Activation) (None, 512, 1, 32) 0 ['add_3[0][0]']
max_pooling2d_1 (MaxPooling2D) (None, 256, 1, 32) 0 ['activation_10[0][0]']
conv2d_11 (Conv2D) (None, 256, 1, 32) 1056 ['max_pooling2d_1[0][0]']
batch_normalization_11 (BatchN (None, 256, 1, 32) 128 ['conv2d_11[0][0]']
ormalization)
activation_11 (Activation) (None, 256, 1, 32) 0 ['batch_normalization_11[0][0]']
conv2d_12 (Conv2D) (None, 256, 1, 32) 5152 ['activation_11[0][0]']
batch_normalization_12 (BatchN (None, 256, 1, 32) 128 ['conv2d_12[0][0]']
ormalization)
activation_12 (Activation) (None, 256, 1, 32) 0 ['batch_normalization_12[0][0]']
conv2d_13 (Conv2D) (None, 256, 1, 32) 5152 ['activation_12[0][0]']
batch_normalization_13 (BatchN (None, 256, 1, 32) 128 ['conv2d_13[0][0]']
ormalization)
add_4 (Add) (None, 256, 1, 32) 0 ['batch_normalization_13[0][0]',
'max_pooling2d_1[0][0]']
activation_13 (Activation) (None, 256, 1, 32) 0 ['add_4[0][0]']
conv2d_14 (Conv2D) (None, 256, 1, 32) 5152 ['activation_13[0][0]']
batch_normalization_14 (BatchN (None, 256, 1, 32) 128 ['conv2d_14[0][0]']
ormalization)
activation_14 (Activation) (None, 256, 1, 32) 0 ['batch_normalization_14[0][0]']
conv2d_15 (Conv2D) (None, 256, 1, 32) 5152 ['activation_14[0][0]']
batch_normalization_15 (BatchN (None, 256, 1, 32) 128 ['conv2d_15[0][0]']
ormalization)
add_5 (Add) (None, 256, 1, 32) 0 ['batch_normalization_15[0][0]',
'activation_13[0][0]']
activation_15 (Activation) (None, 256, 1, 32) 0 ['add_5[0][0]']
max_pooling2d_2 (MaxPooling2D) (None, 128, 1, 32) 0 ['activation_15[0][0]']
conv2d_16 (Conv2D) (None, 128, 1, 32) 1056 ['max_pooling2d_2[0][0]']
batch_normalization_16 (BatchN (None, 128, 1, 32) 128 ['conv2d_16[0][0]']
ormalization)
activation_16 (Activation) (None, 128, 1, 32) 0 ['batch_normalization_16[0][0]']
conv2d_17 (Conv2D) (None, 128, 1, 32) 5152 ['activation_16[0][0]']
batch_normalization_17 (BatchN (None, 128, 1, 32) 128 ['conv2d_17[0][0]']
ormalization)
activation_17 (Activation) (None, 128, 1, 32) 0 ['batch_normalization_17[0][0]']
conv2d_18 (Conv2D) (None, 128, 1, 32) 5152 ['activation_17[0][0]']
batch_normalization_18 (BatchN (None, 128, 1, 32) 128 ['conv2d_18[0][0]']
ormalization)
add_6 (Add) (None, 128, 1, 32) 0 ['batch_normalization_18[0][0]',
'max_pooling2d_2[0][0]']
activation_18 (Activation) (None, 128, 1, 32) 0 ['add_6[0][0]']
conv2d_19 (Conv2D) (None, 128, 1, 32) 5152 ['activation_18[0][0]']
batch_normalization_19 (BatchN (None, 128, 1, 32) 128 ['conv2d_19[0][0]']
ormalization)
activation_19 (Activation) (None, 128, 1, 32) 0 ['batch_normalization_19[0][0]']
conv2d_20 (Conv2D) (None, 128, 1, 32) 5152 ['activation_19[0][0]']
batch_normalization_20 (BatchN (None, 128, 1, 32) 128 ['conv2d_20[0][0]']
ormalization)
add_7 (Add) (None, 128, 1, 32) 0 ['batch_normalization_20[0][0]',
'activation_18[0][0]']
activation_20 (Activation) (None, 128, 1, 32) 0 ['add_7[0][0]']
max_pooling2d_3 (MaxPooling2D) (None, 64, 1, 32) 0 ['activation_20[0][0]']
conv2d_21 (Conv2D) (None, 64, 1, 32) 5152 ['max_pooling2d_3[0][0]']
batch_normalization_21 (BatchN (None, 64, 1, 32) 128 ['conv2d_21[0][0]']
ormalization)
activation_21 (Activation) (None, 64, 1, 32) 0 ['batch_normalization_21[0][0]']
reshape (Reshape) (None, 8, 8, 32) 0 ['activation_21[0][0]']
global_average_pooling2d (Glob (None, 32) 0 ['reshape[0][0]']
alAveragePooling2D)
dense (Dense) (None, 256) 8448 ['global_average_pooling2d[0][0]'
]
dropout (Dropout) (None, 256) 0 ['dense[0][0]']
dense_1 (Dense) (None, 128) 32896 ['dropout[0][0]']
dropout_1 (Dropout) (None, 128) 0 ['dense_1[0][0]']
dense_2 (Dense) (None, 22) 2838 ['dropout_1[0][0]']
softmax (Activation) (None, 22) 0 ['dense_2[0][0]']
==================================================================================================
Total params: 139,158
Trainable params: 137,750
Non-trainable params: 1,408
__________________________________________________________________________________________________
None
This is the trained model (trained only on the first 200k samples of the dataset).
Thanks. I am not sure that 'same' padding translates to zero padding in your model. Normally, conv2d in hls4ml uses zero padding, but this can be configured. I suggest you test a small model first.
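As a concrete way to do that, here is a minimal sketch (it reuses only the conversion calls already shown above; the shapes and layer sizes are arbitrary) that pushes a single 'same'-padded Conv2D through the same io_stream/Vivado flow to see whether predict() already hits the pad assertion:
# Minimal single-layer test to isolate the 'same'-padding assertion.
# Shapes and sizes are arbitrary; only the conversion flow matches the script above.
import numpy as np
import tensorflow as tf
import hls4ml

tiny = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 1, 2)),
    tf.keras.layers.Conv2D(4, (5, 1), padding='same'),
])

cfg = hls4ml.utils.config_from_keras_model(tiny, granularity='name')
hls_tiny = hls4ml.converters.convert_from_keras_model(
    tiny, hls_config=cfg, io_type='io_stream', backend='Vivado',
    output_dir='tiny_pad_test/', part='xczu7ev-ffvc1156-2-e')
hls_tiny.compile()

# If padding is the trigger, this predict() should already abort with the same
# assertion; rebuilding the tiny model with padding='valid' should not.
x = np.ascontiguousarray(np.random.rand(1, 64, 1, 2).astype(np.float32))
print(hls_tiny.predict(x))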
Thank you for the input! I switched the model to use Conv1D instead of Conv2D, but the issue remained. I was able to fix it by removing the 'same' padding from the MaxPooling2D (now MaxPooling1D) layers of the ResNet blocks. Maybe there is an issue with 'same' padding for the pooling layers?
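For reference, a sketch of how the reworked block might look after that change (reconstructed from the description above; not necessarily the exact code that was synthesized):
from tensorflow.keras.layers import Conv1D, BatchNormalization, Activation, Add, MaxPooling1D

def resnet_block_1d(input_data, filters, conv_size):
    # Same structure as the Conv2D block above, but 1D and without
    # 'same' padding on the pooling layer.
    x = Conv1D(filters, 1, activation=None, padding='same')(input_data)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Conv1D(filters, conv_size, activation=None, padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Conv1D(filters, conv_size, activation=None, padding='same')(x)
    x = BatchNormalization()(x)
    x = Add()([x, input_data])
    x = Activation('relu')(x)
    y = Conv1D(filters, conv_size, activation=None, padding='same')(x)
    y = BatchNormalization()(y)
    y = Activation('relu')(y)
    y = Conv1D(filters, conv_size, activation=None, padding='same')(y)
    y = BatchNormalization()(y)
    y = Add()([y, x])
    y = Activation('relu')(y)
    # default 'valid' padding here is what made synthesis pass
    z = MaxPooling1D(2, strides=2)(y)
    return z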
Anyway, the model passes synthesis now, but resource usage seems high (target: ZCU104):
Strategy is Resource and ReuseFactor has been increased to 16 for the large Dense layer with 32,896 parameters.
Is this to be expected, or is it too high?
As far as I know, the Resource strategy is not available at the moment; mainly the Latency strategy is supported. If you want a good balance between latency and resources, I suggest improving the implementation at the hardware level. hls4ml is more friendly to lightweight models.
The Resource strategy has worked for me in the past, and I saw that in #534 you were able to get past synthesis with the VGG-16 model. When choosing the Latency strategy I get issues with layers containing more than 4096 parameters; how did you address that?
My goal right now is to implement the model fully parallel without any optimizations, and then to see how much things can be improved by quantization, pruning, etc.
Nice, and could you share your Resource-strategy configuration for the ResNet if possible? There is too much array partitioning here; you may consider modifying it or just commenting it out.
For the Resource Strategy I just replaced this:
hls_config['Model']['Precision'] = 'ap_fixed<16,6>'
hls_config['Model']['ReuseFactor'] = 1
for Layer in hls_config['LayerName'].keys():
    hls_config['LayerName'][Layer]['Strategy'] = 'Latency'
    hls_config['LayerName'][Layer]['ReuseFactor'] = 1
# 'Stable' gives the best numerical performance for high-accuracy models; the default Latency implementation is faster but numerically less stable
hls_config['LayerName']['softmax']['Strategy'] = 'Stable'
with this:
hls_config['Model']['Strategy'] = 'Resource'
hls_config['LayerName']['softmax']['Strategy'] = 'Stable'
hls_config['LayerName']['dense_28']['ReuseFactor'] = 16
I tried the Latency strategy with the #pragma HLS ARRAY_PARTITION variable=mult complete line commented out as you suggested, but it did not help; now I keep running out of memory during synthesis (the crash log ends in "$RDI_PROG" "$@"). Is this something you have experienced when working with larger models? Are 48 GB of RAM not enough?
OK, thanks. I commented that out because the size of the layer is over 4096, but I am not sure about your problem. Yes, I tried to compile VGG-16 and encountered the same memory problem, so I think we should reduce the precision to 8 bits. In this way it should be possible to deploy.
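As a rough sketch of what that could look like with the config API already used in this thread (ap_fixed<8,3> is only an illustrative split; the right integer width depends on the trained weight and activation ranges):
# Illustrative only: drop the model-wide precision to 8 bits and re-check
# accuracy with the C-simulation predict() before spending hours on synthesis.
hls_config['Model']['Precision'] = 'ap_fixed<8,3>'   # arbitrary 8-bit split

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=hls_config, io_type='io_stream', backend='Vivado',
    output_dir='tmpp_8bit/', part='xczu7ev-ffvc1156-2-e')
hls_model.compile()

Y_pred = hls_model.predict(np.ascontiguousarray(X_test))
print("8-bit accuracy:", accuracy_score(np.argmax(Y_test, axis=1), np.argmax(Y_pred, axis=1)))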
Btw, you can also try vitis_hls 2020.2.
Hi, I am also trying to work with the ResNet and Inception architectures. While the unquantized models synthesize well with hls4ml, there is an accuracy drop for the quantized models of the above-mentioned architectures. Any insights into the issue I am facing would be of great help: #587
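One way to start narrowing down where the quantized models lose accuracy is the numerical profiling utility, if your hls4ml version ships it (the snippet below is only a sketch; model and data names are placeholders for your own objects):
# Sketch only: compare layer-by-layer value ranges of the (Q)Keras model with the
# fixed-point types chosen in the hls4ml config. 'model', 'hls_model' and
# 'X_sample' are placeholders.
from hls4ml.model.profiling import numerical

figs = numerical(model=model, hls_model=hls_model, X=X_sample)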