Some FINN activation export tests randomly failing

Open maltanar opened this issue 5 years ago • 1 comments

Some FINN activation export tests fail intermittently. This was first mentioned on the FINN repo:

https://github.com/Xilinx/finn/pull/134#issuecomment-634714886

Here is one of the failing cases we are observing; prod and expec differ only in the second element:

test_brevitas_act_export_qhardtanh_scaled[PARAMETER-1.0--0.9921875-True-8]

abits:  8  | narrow_range:  True  | min_val:  -0.9921875  | max_val:  1.0
layer scale:  tensor(0.0111)
export scale:  tensor(0.0111)
input:   0.2118,  -0.2156,  -0.5805,   0.7864,   0.4232,  -0.0256,  -0.9301,  -0.8003,  -0.5085,   0.9171,   0.8515,  -0.5275,  -0.2147,   0.1295,   0.0240
prod :   0.2101,  -0.2212,  -0.5750,   0.7852,   0.4202,  -0.0221,  -0.9289,  -0.7962,  -0.5087,   0.9179,   0.8515,  -0.5308,  -0.2101,   0.1327,   0.0221
expec:   0.2101,  -0.2101,  -0.5750,   0.7852,   0.4202,  -0.0221,  -0.9289,  -0.7962,  -0.5087,   0.9179,   0.8515,  -0.5308,  -0.2101,   0.1327,   0.0221
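
To quantify the mismatch in this log, here is a minimal sketch that compares the prod and expec rows and expresses each disagreement in quantization steps. The arrays are copied from the printed output, and the scale is the printed 0.0111, which is rounded to four digits, so the step count is approximate.

```python
import numpy as np

# Values copied from the log above (printed precision only).
scale = 0.0111
prod = np.array([0.2101, -0.2212, -0.5750, 0.7852, 0.4202, -0.0221, -0.9289,
                 -0.7962, -0.5087, 0.9179, 0.8515, -0.5308, -0.2101, 0.1327,
                 0.0221])
expec = np.array([0.2101, -0.2101, -0.5750, 0.7852, 0.4202, -0.0221, -0.9289,
                  -0.7962, -0.5087, 0.9179, 0.8515, -0.5308, -0.2101, 0.1327,
                  0.0221])

# Indices where the two outputs disagree beyond print precision.
mismatch = np.flatnonzero(~np.isclose(prod, expec, atol=1e-4))

# Size of each disagreement, in units of the quantization step.
steps = np.rint((prod[mismatch] - expec[mismatch]) / scale).astype(int)

print("mismatching indices:", mismatch.tolist())  # [1]
print("difference in steps:", steps.tolist())     # [-1]
```

The single disagreement is one quantization step, which would be consistent with an off-by-one at a rounding boundary rather than a scale mismatch.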

Test code is here:

https://github.com/Xilinx/finn/blob/dev/tests/brevitas/test_brevitas_scaled_QHardTanh_export.py

Let me know if you would like help moving these tests into the Brevitas-FINN integration test suite (best tracked in a separate issue) or with any other debugging.

Jun 17 '20 10:06 maltanar

I've been debugging the mismatch between the outputs of the ONNX-exported 4-bit MobileNet-v1 and its Brevitas implementation, and this issue appears to be the cause there as well. For instance, the first activation in the network (a 4-bit QuantReLU with channelwise scaling) produces 394272 values; 394270 of them match between FINN and Brevitas, but 2 differ. Here is one example:

input = [[0.54479885]]
out_scale = [[0.12106641]]
Brevitas QuantReLU output (as integers) = [[4]]
Exported MultiThreshold output = [[5]]
Exported threshold values (4-bit QuantReLU so 15 thresholds) = 
[[0.0605332 , 0.18159962, 0.302666  , 0.42373243, 0.54479885,
       0.66586524, 0.7869317 , 0.9079981 , 1.0290644 , 1.1501309 ,
       1.2711972 , 1.3922637 , 1.5133301 , 1.6343964 , 1.7554629 ],
]

In this case the input value appears to equal the fifth exported threshold (0.54479885). This could mean either:

- The exported threshold values aren't quite correct and may need to be adjusted slightly, or
- The exported threshold values are correct, but FINN and Brevitas behave differently when value = threshold.
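
The second possibility can be illustrated with a small sketch. It rests on two assumptions not confirmed in this thread: that MultiThreshold counts every threshold t with x >= t, and that the Brevitas side rounds half to even (as `np.round` and `torch.round` do). The helpers `multithreshold` and `round_quant` are hypothetical names for illustration, not FINN or Brevitas APIs. The toy scale of 0.125 is chosen so that the tie is exact in floating point.

```python
import numpy as np

def multithreshold(x, thresholds):
    """Count thresholds t with x >= t (assumed tie handling)."""
    return int(np.sum(x >= thresholds))

def round_quant(x, scale, n_bits=4):
    """Unsigned quantization using round-half-to-even (assumed)."""
    q = np.round(x / scale)
    return int(np.clip(q, 0, 2 ** n_bits - 1))

# Toy values that are exactly representable, so x sits exactly on a threshold.
scale = 0.125
thresholds = (np.arange(15) + 0.5) * scale  # 15 thresholds for 4 bits
x = 4.5 * scale                             # equals thresholds[4]

print(multithreshold(x, thresholds))  # 5: the tie counts as crossed
print(round_quant(x, scale))          # 4: 4.5 rounds to the even neighbour

# The values printed above, compared the same way.
issue_thresholds = np.array(
    [0.0605332, 0.18159962, 0.302666, 0.42373243, 0.54479885,
     0.66586524, 0.7869317, 0.9079981, 1.0290644, 1.1501309,
     1.2711972, 1.3922637, 1.5133301, 1.6343964, 1.7554629],
    dtype=np.float32)
issue_x = np.float32(0.54479885)
print(multithreshold(issue_x, issue_thresholds))  # 5
```

Under these assumptions, an input that lands exactly on a threshold whose lower level is even would give 5 from the threshold count and 4 from rounding, matching the values reported above. Whether the real float32 division lands exactly on 4.5 depends on the unprinted digits, so this is only a plausible explanation.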

Jul 17 '20 10:07 maltanar