
When Strategy is set to 'Resource', the Vivado backend behaves abnormally while the Vitis backend still works correctly

Irisaka opened this issue 5 months ago · 3 comments

In the official example "part6_cnns" (https://github.com/fastmachinelearning/hls4ml-tutorial/blob/main/part6_cnns.ipynb), I used Keras and QKeras to train the models. After converting with the 'Vitis' backend, I got correct prediction results. When I switched to the 'Vivado' backend, the Keras model still performed correctly, but the converted QKeras model went wrong and its accuracy dropped significantly.

Here is the relevant part of the conversion code:

import hls4ml  # model, qmodel and plotting are defined earlier in the notebook

# First, the baseline model
hls_config = hls4ml.utils.config_from_keras_model(
    model, granularity='name', backend='Vivado', default_precision='ap_fixed<16,6>'
)
hls_config['Model']['Strategy'] = 'Resource'
plotting.print_dict(hls_config)

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=hls_config,
    # backend='Vitis',
    backend='Vivado',
    output_dir='model_1/hls4ml_prj',
    part='xcu250-figd2104-2L-e',
    io_type='io_stream',
)
hls_model.compile()
# Then the QKeras model
hls_config_q = hls4ml.utils.config_from_keras_model(qmodel, granularity='name', backend='Vivado')
hls_config_q['Model']['Strategy'] = 'Resource'
plotting.print_dict(hls_config_q)

hls_model_q = hls4ml.converters.convert_from_keras_model(
    qmodel, hls_config=hls_config_q, output_dir='quantized_pruned_cnn', backend='Vivado', io_type='io_stream'
)

hls_model_q.compile()
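
For reference, the accuracies below were computed as in the tutorial (a sketch; X_test and y_test are the dataset splits defined earlier in the notebook):

import numpy as np
from sklearn.metrics import accuracy_score

y_keras = model.predict(X_test)
y_hls = hls_model.predict(np.ascontiguousarray(X_test))
print('Accuracy Keras: ', accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_keras, axis=1)))
print('Accuracy hls4ml:', accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_hls, axis=1)))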

ROC plotting results:

Baseline model - Accuracy Keras: 0.8876666666666667, Accuracy hls4ml: 0.887
QKeras model   - Accuracy Keras: 0.834, Accuracy hls4ml: 0.19966666666666666

[Two ROC curve screenshots attached.]

Looking forward to your reply!

Irisaka · Jul 15 '25 03:07

Can you try passing default_precision='ap_fixed<16,6>' to the QKeras model? It looks like the accumulators may not be getting the correct precision for some reason. A conservative approach is to start with the default of 16,6. hls4ml can still infer the quantized weights even if default_precision is provided.
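
If that doesn't help, one way to see where the numerics diverge is per-layer tracing. A minimal sketch, assuming the objects from your snippet (get_ymodel_keras and the trace() method are hls4ml's standard debugging hooks; the flatten() is a guard because the streamed outputs may come back in a different shape):

import numpy as np
import hls4ml
from hls4ml.model.profiling import get_ymodel_keras

# Enable tracing on every layer before compiling
for layer in hls_config_q['LayerName']:
    hls_config_q['LayerName'][layer]['Trace'] = True

hls_model_q = hls4ml.converters.convert_from_keras_model(
    qmodel, hls_config=hls_config_q, output_dir='quantized_pruned_cnn', backend='Vivado', io_type='io_stream'
)
hls_model_q.compile()

# Compare per-layer outputs of the hls4ml model against the QKeras model
X_sample = np.ascontiguousarray(X_test[:100])
_, hls_trace = hls_model_q.trace(X_sample)
keras_trace = get_ymodel_keras(qmodel, X_sample)

for name, out in hls_trace.items():
    if name in keras_trace:
        diff = np.abs(out.flatten() - keras_trace[name].flatten()).max()
        print(f'{name}: max abs diff = {diff}')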

bo3z · Jul 18 '25 12:07

@bo3z Thank you for the suggestion! After passing default_precision='ap_fixed<16,6>' to the QKeras model, the performance did not improve.

# Then the QKeras model, now with default_precision
hls_config_q = hls4ml.utils.config_from_keras_model(
    qmodel, granularity='name', backend='Vivado', default_precision='ap_fixed<16,6>'
)
hls_config_q['Model']['Strategy'] = 'Resource'
plotting.print_dict(hls_config_q)

hls_model_q = hls4ml.converters.convert_from_keras_model(
    qmodel, hls_config=hls_config_q, output_dir='quantized_pruned_cnn', backend='Vivado', io_type='io_stream'
)

hls_model_q.compile()
Baseline model:
Accuracy Keras:  0.8903333333333333
Accuracy hls4ml: 0.8896666666666667

QKeras model:
Accuracy Keras:  0.86
Accuracy hls4ml: 0.195
[ROC curve screenshot attached.]

In the printed configuration dictionary, the only difference between the runs with and without default_precision is the Model default precision shown below.

Model
  Precision
    default:         ap_fixed<16,6>  #  <-- with default precision
    default:         fixed<16,6>       #  <-- without default precision
  ReuseFactor:       1
  Strategy:          Resource
  BramFactor:        1000000000
  TraceOutput:       False
LayerName
  input_2
    Trace:           False
    Precision
      result:        auto
  fused_convbn_0
    Trace:           False
    Precision
      result:        auto
      weight:        fixed<6,1,TRN,WRAP,0>
      bias:          fixed<6,1,TRN,WRAP,0>
      accum:         auto
    ReuseFactor:     1
    ParallelizationFactor:1
    ConvImplementation:LineBuffer
  fused_convbn_0_linear
    Trace:           False
    Precision
      result:        auto
      table:         fixed<18,8,TRN,WRAP,0>
    ReuseFactor:     1
    TableSize:       1024
  conv_act_0
    Trace:           False
    Precision
      result:        ufixed<6,0,RND_CONV,SAT,0>
      table:         fixed<18,8,TRN,WRAP,0>
    ReuseFactor:     1
    TableSize:       1024
  pool_0
    Trace:           False
    Precision
      result:        auto
      accum:         auto
    ReuseFactor:     1
    ConvImplementation:LineBuffer
  fused_convbn_1
    Trace:           False
    Precision
      result:        auto
      weight:        fixed<6,1,TRN,WRAP,0>
      bias:          fixed<6,1,TRN,WRAP,0>
      accum:         auto
    ReuseFactor:     1
    ParallelizationFactor:1
    ConvImplementation:LineBuffer
  fused_convbn_1_linear
    Trace:           False
    Precision
      result:        auto
      table:         fixed<18,8,TRN,WRAP,0>
    ReuseFactor:     1
    TableSize:       1024
  conv_act_1
    Trace:           False
    Precision
      result:        ufixed<6,0,RND_CONV,SAT,0>
      table:         fixed<18,8,TRN,WRAP,0>
    ReuseFactor:     1
    TableSize:       1024
  pool_1
    Trace:           False
    Precision
      result:        auto
      accum:         auto
    ReuseFactor:     1
    ConvImplementation:LineBuffer
  fused_convbn_2
    Trace:           False
    Precision
      result:        auto
      weight:        fixed<6,1,TRN,WRAP,0>
      bias:          fixed<6,1,TRN,WRAP,0>
      accum:         auto
    ReuseFactor:     1
    ParallelizationFactor:1
    ConvImplementation:LineBuffer
  fused_convbn_2_linear
    Trace:           False
    Precision
      result:        auto
      table:         fixed<18,8,TRN,WRAP,0>
    ReuseFactor:     1
    TableSize:       1024
  conv_act_2
    Trace:           False
    Precision
      result:        ufixed<6,0,RND_CONV,SAT,0>
      table:         fixed<18,8,TRN,WRAP,0>
    ReuseFactor:     1
    TableSize:       1024
  pool_2
    Trace:           False
    Precision
      result:        auto
      accum:         auto
    ReuseFactor:     1
    ConvImplementation:LineBuffer
  flatten_1
    Trace:           False
    Precision
      result:        auto
  dense_0
    Trace:           False
    Precision
      result:        auto
      weight:        fixed<6,1,TRN,WRAP,0>
      bias:          auto
      accum:         auto
    ReuseFactor:     1
  dense_0_linear
    Trace:           False
    Precision
      result:        auto
      table:         fixed<18,8,TRN,WRAP,0>
    ReuseFactor:     1
    TableSize:       1024
  bn_dense_0
    Trace:           False
    Precision
      result:        auto
      scale:         auto
      bias:          auto
    ReuseFactor:     1
  dense_act_0
    Trace:           False
    Precision
      result:        ufixed<6,0,RND_CONV,SAT,0>
      table:         fixed<18,8,TRN,WRAP,0>
    ReuseFactor:     1
    TableSize:       1024
  dense_1
    Trace:           False
    Precision
      result:        auto
      weight:        fixed<6,1,TRN,WRAP,0>
      bias:          auto
      accum:         auto
    ReuseFactor:     1
  dense_1_linear
    Trace:           False
    Precision
      result:        auto
      table:         fixed<18,8,TRN,WRAP,0>
    ReuseFactor:     1
    TableSize:       1024
  bn_dense_1
    Trace:           False
    Precision
      result:        auto
      scale:         auto
      bias:          auto
    ReuseFactor:     1
  dense_act_1
    Trace:           False
    Precision
      result:        ufixed<6,0,RND_CONV,SAT,0>
      table:         fixed<18,8,TRN,WRAP,0>
    ReuseFactor:     1
    TableSize:       1024
  output_dense
    Trace:           False
    Precision
      result:        auto
      weight:        auto
      bias:          auto
      accum:         auto
    ReuseFactor:     1
  output_dense_linear
    Trace:           False
    Precision
      result:        auto
      table:         fixed<18,8,TRN,WRAP,0>
    ReuseFactor:     1
    TableSize:       1024
  output_softmax
    Trace:           False
    Precision
      result:        auto
      table:         fixed<18,8,TRN,WRAP,0>
      exp_table:     fixed<18,8,RND,SAT,0>
      inv_table:     fixed<18,8,RND,SAT,0>
    ReuseFactor:     1
    TableSize:       1024
    Implementation:  stable
    Skip:            False

Irisaka · Jul 21 '25 13:07

That seems rather wrong. What happens if you try increasing the precision, for example <18, 8> or <32, 10>? That would show whether this is a bug in parsing the model or just that the default precision isn't sufficient. You could also try removing the softmax if you only need the logits rather than the actual probabilities, by setting the softmax strategy to argmax (to rule out any issues with softmax).
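
Concretely, something along these lines (a sketch reusing the names from your snippet; 'argmax' makes the softmax layer output only a one-hot of the largest logit):

# 1) Wider default precision, to separate parsing bugs from insufficient range
hls_config_q = hls4ml.utils.config_from_keras_model(
    qmodel, granularity='name', backend='Vivado', default_precision='ap_fixed<32,10>'
)
hls_config_q['Model']['Strategy'] = 'Resource'

# 2) Rule out softmax issues: compute only the index of the largest logit
hls_config_q['LayerName']['output_softmax']['Implementation'] = 'argmax'

hls_model_q = hls4ml.converters.convert_from_keras_model(
    qmodel, hls_config=hls_config_q, output_dir='quantized_pruned_cnn', backend='Vivado', io_type='io_stream'
)
hls_model_q.compile()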

bo3z · Jul 21 '25 17:07