When Strategy is set to 'Resource', the Vivado backend behaves abnormally while the Vitis backend still works correctly
In the official example "part6_cnns" (https://github.com/fastmachinelearning/hls4ml-tutorial/blob/main/part6_cnns.ipynb), I trained the model with Keras and QKeras. After converting it with the "Vitis" backend, I got correct prediction results. When I switched to the "Vivado" backend, the Keras model still worked correctly, but the converted QKeras model went wrong and its accuracy dropped significantly.
Here is the relevant part of the conversion code:
import hls4ml
import plotting  # helper module bundled with the hls4ml-tutorial repo (imported in an earlier cell)

# First, the baseline model
hls_config = hls4ml.utils.config_from_keras_model(
    model, granularity='name', backend='Vivado', default_precision='ap_fixed<16,6>'
)
hls_config['Model']['Strategy'] = 'Resource'
plotting.print_dict(hls_config)

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=hls_config,
    # backend='Vitis',
    backend='Vivado',
    output_dir='model_1/hls4ml_prj',
    part='xcu250-figd2104-2L-e',
    io_type='io_stream',
)
hls_model.compile()
# Then the QKeras model
hls_config_q = hls4ml.utils.config_from_keras_model(qmodel, granularity='name', backend='Vivado')
hls_config_q['Model']['Strategy'] = 'Resource'
plotting.print_dict(hls_config_q)

hls_model_q = hls4ml.converters.convert_from_keras_model(
    qmodel, hls_config=hls_config_q, output_dir='quantized_pruned_cnn', backend='Vivado', io_type='io_stream'
)
hls_model_q.compile()
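The accuracies below were computed roughly as in the tutorial, by comparing argmax class predictions against the labels. A minimal sketch, assuming X_test and y_test are the one-hot test arrays from the notebook (the same comparison is repeated for qmodel / hls_model_q):

import numpy as np
from sklearn.metrics import accuracy_score

# Sketch of the accuracy check, mirroring the tutorial notebook:
# compare argmax class predictions of the Keras model and the compiled hls4ml model
y_keras = model.predict(X_test)
y_hls = hls_model.predict(np.ascontiguousarray(X_test))
print('Accuracy Keras: ', accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_keras, axis=1)))
print('Accuracy hls4ml:', accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_hls, axis=1)))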
ROC plotting results:

Baseline model:
Accuracy Keras: 0.8876666666666667
Accuracy hls4ml: 0.887

QKeras model:
Accuracy Keras: 0.834
Accuracy hls4ml: 0.19966666666666666
Can you try passing default_precision='ap_fixed<16,6>' to the QKeras model? It looks like the accumulators may not be getting the correct precision for some reason. A conservative approach is to start with the default of 16,6. hls4ml can still infer the quantized weights even if default_precision is provided.
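If that alone doesn't help, you could also force the accumulator precision per layer directly in the config dict. A rough sketch on top of the hls_config_q built above; 'ap_fixed<32,16>' is just a deliberately generous placeholder width:

# Rough sketch: widen every per-layer accumulator that the config exposes
for layer_cfg in hls_config_q['LayerName'].values():
    precision = layer_cfg.get('Precision', {})
    if isinstance(precision, dict) and 'accum' in precision:
        precision['accum'] = 'ap_fixed<32,16>'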
@bo3z Thank you for the suggestion! After passing default_precision='ap_fixed<16,6>' to the QKeras model, the accuracy did not improve.
# Then the QKeras model
hls_config_q = hls4ml.utils.config_from_keras_model(
    qmodel, granularity='name', backend='Vivado', default_precision='ap_fixed<16,6>'
)
hls_config_q['Model']['Strategy'] = 'Resource'
plotting.print_dict(hls_config_q)

hls_model_q = hls4ml.converters.convert_from_keras_model(
    qmodel, hls_config=hls_config_q, output_dir='quantized_pruned_cnn', backend='Vivado', io_type='io_stream'
)
hls_model_q.compile()
Baseline model:
Accuracy Keras: 0.8903333333333333
Accuracy hls4ml: 0.8896666666666667

QKeras model:
Accuracy Keras: 0.86
Accuracy hls4ml: 0.195
In the printed configuration dictionary, the only difference I could find between the runs with and without default_precision is this:
Model
Precision
default: ap_fixed<16,6> # <-- with default precision
default: fixed<16,6> # <-- without default precision
ReuseFactor: 1
Strategy: Resource
BramFactor: 1000000000
TraceOutput: False
LayerName
input_2
Trace: False
Precision
result: auto
fused_convbn_0
Trace: False
Precision
result: auto
weight: fixed<6,1,TRN,WRAP,0>
bias: fixed<6,1,TRN,WRAP,0>
accum: auto
ReuseFactor: 1
ParallelizationFactor: 1
ConvImplementation: LineBuffer
fused_convbn_0_linear
Trace: False
Precision
result: auto
table: fixed<18,8,TRN,WRAP,0>
ReuseFactor: 1
TableSize: 1024
conv_act_0
Trace: False
Precision
result: ufixed<6,0,RND_CONV,SAT,0>
table: fixed<18,8,TRN,WRAP,0>
ReuseFactor: 1
TableSize: 1024
pool_0
Trace: False
Precision
result: auto
accum: auto
ReuseFactor: 1
ConvImplementation: LineBuffer
fused_convbn_1
Trace: False
Precision
result: auto
weight: fixed<6,1,TRN,WRAP,0>
bias: fixed<6,1,TRN,WRAP,0>
accum: auto
ReuseFactor: 1
ParallelizationFactor: 1
ConvImplementation: LineBuffer
fused_convbn_1_linear
Trace: False
Precision
result: auto
table: fixed<18,8,TRN,WRAP,0>
ReuseFactor: 1
TableSize: 1024
conv_act_1
Trace: False
Precision
result: ufixed<6,0,RND_CONV,SAT,0>
table: fixed<18,8,TRN,WRAP,0>
ReuseFactor: 1
TableSize: 1024
pool_1
Trace: False
Precision
result: auto
accum: auto
ReuseFactor: 1
ConvImplementation: LineBuffer
fused_convbn_2
Trace: False
Precision
result: auto
weight: fixed<6,1,TRN,WRAP,0>
bias: fixed<6,1,TRN,WRAP,0>
accum: auto
ReuseFactor: 1
ParallelizationFactor: 1
ConvImplementation: LineBuffer
fused_convbn_2_linear
Trace: False
Precision
result: auto
table: fixed<18,8,TRN,WRAP,0>
ReuseFactor: 1
TableSize: 1024
conv_act_2
Trace: False
Precision
result: ufixed<6,0,RND_CONV,SAT,0>
table: fixed<18,8,TRN,WRAP,0>
ReuseFactor: 1
TableSize: 1024
pool_2
Trace: False
Precision
result: auto
accum: auto
ReuseFactor: 1
ConvImplementation: LineBuffer
flatten_1
Trace: False
Precision
result: auto
dense_0
Trace: False
Precision
result: auto
weight: fixed<6,1,TRN,WRAP,0>
bias: auto
accum: auto
ReuseFactor: 1
dense_0_linear
Trace: False
Precision
result: auto
table: fixed<18,8,TRN,WRAP,0>
ReuseFactor: 1
TableSize: 1024
bn_dense_0
Trace: False
Precision
result: auto
scale: auto
bias: auto
ReuseFactor: 1
dense_act_0
Trace: False
Precision
result: ufixed<6,0,RND_CONV,SAT,0>
table: fixed<18,8,TRN,WRAP,0>
ReuseFactor: 1
TableSize: 1024
dense_1
Trace: False
Precision
result: auto
weight: fixed<6,1,TRN,WRAP,0>
bias: auto
accum: auto
ReuseFactor: 1
dense_1_linear
Trace: False
Precision
result: auto
table: fixed<18,8,TRN,WRAP,0>
ReuseFactor: 1
TableSize: 1024
bn_dense_1
Trace: False
Precision
result: auto
scale: auto
bias: auto
ReuseFactor: 1
dense_act_1
Trace: False
Precision
result: ufixed<6,0,RND_CONV,SAT,0>
table: fixed<18,8,TRN,WRAP,0>
ReuseFactor: 1
TableSize: 1024
output_dense
Trace: False
Precision
result: auto
weight: auto
bias: auto
accum: auto
ReuseFactor: 1
output_dense_linear
Trace: False
Precision
result: auto
table: fixed<18,8,TRN,WRAP,0>
ReuseFactor: 1
TableSize: 1024
output_softmax
Trace: False
Precision
result: auto
table: fixed<18,8,TRN,WRAP,0>
exp_table: fixed<18,8,RND,SAT,0>
inv_table: fixed<18,8,RND,SAT,0>
ReuseFactor: 1
TableSize: 1024
Implementation: stable
Skip: False
That seems rather wrong. What happens if you try increasing the precision, for example to <18,8> or <32,10>? That would help tell whether this is a bug in parsing the model or just that the default precision isn't sufficient. You could also try removing the softmax if you only need the logits rather than the actual probabilities, by setting the softmax strategy to argmax (to rule out any issues with softmax).
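Something along these lines (a rough, untested sketch: 'output_softmax' is the layer name from your printed config, the output_dir is arbitrary, and the 'argmax' softmax implementation requires a reasonably recent hls4ml version):

# Sketch: rebuild the QKeras config with a wider default precision and an argmax softmax
hls_config_q = hls4ml.utils.config_from_keras_model(
    qmodel, granularity='name', backend='Vivado', default_precision='ap_fixed<18,8>'
)
hls_config_q['Model']['Strategy'] = 'Resource'
# Use the argmax softmax implementation instead of the stable one (drops the actual probabilities)
hls_config_q['LayerName']['output_softmax']['Implementation'] = 'argmax'

hls_model_q = hls4ml.converters.convert_from_keras_model(
    qmodel, hls_config=hls_config_q, output_dir='quantized_pruned_cnn_wide', backend='Vivado', io_type='io_stream'
)
hls_model_q.compile()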