
Model quantization fails with "ValueError: cannot convert float NaN to integer"

Open LS1030 opened this issue 2 years ago • 1 comment

Problem description: When quantizing an ONNX model with quantized_dtype="asymmetric_affine-u8" and quantized_algorithm="kl_divergence", quantization fails with ValueError: cannot convert float NaN to integer. The original model takes a single-channel image as input, and the calibration dataset is in npy format.

Environment: Ubuntu 18.04, rknn-toolkit 1.7.1, Python 3.6.7

Code:

rknn = RKNN()

rknn.config(
    mean_values=[[0]],
    std_values=[[1]],
    reorder_channel="0 1 2",
    quantized_dtype="asymmetric_affine-u8",
    epochs=40,
    batch_size=16,
    quantized_algorithm="kl_divergence",
    # quantized_algorithm="normal",
    optimization_level=3,
)

rknn.load_onnx(model="./Model/onnx/test.onnx")

rknn.build(do_quantization=True, dataset="./dataset/dataset_npy.txt", rknn_batch_size=-1)

rknn.export_rknn("./Model/rknn/test.rknn")

rknn.release()

Running this code fails at rknn.build with the following error:

E Catch exception when building RKNN model!
E Traceback (most recent call last):
E   File "rknn/base/RKNNlib/app/medusa/quantization.py", line 64, in rknn.base.RKNNlib.app.medusa.quantization.Quantization._run_quantization
E   File "rknn/base/RKNNlib/quantization/quantize_manager.py", line 379, in rknn.base.RKNNlib.quantization.quantize_manager.QuantizeManager.calculate_params
E   File "rknn/base/RKNNlib/quantization/quantize_manager.py", line 174, in rknn.base.RKNNlib.quantization.quantize_manager.QuantizeManager._cal_quantize_param_all
E   File "rknn/base/RKNNlib/quantization/asymmetric_quantizer.py", line 34, in rknn.base.RKNNlib.quantization.asymmetric_quantizer.AsymmetricQuantizer.cal_quantize_param_all
E   File "rknn/base/RKNNlib/quantization/utils.py", line 145, in rknn.base.RKNNlib.quantization.utils.get_max_min_range
E ValueError: cannot convert float NaN to integer
E Please feedback the detailed log file <log_feedback_to_the_rknn_toolkit_dev_team.log> to the RKNN Toolkit development team.
E You can also check github issues: https://github.com/rockchip-linux/rknn-toolkit/issues
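One quick way to narrow this down is to rule out bad calibration data: a single NaN or Inf in any of the .npy files listed in dataset_npy.txt would poison the min/max range estimation inside get_max_min_range and produce exactly this error. Below is a minimal sketch (not part of rknn-toolkit; find_bad_npy is a hypothetical helper name) that scans every file referenced by the dataset list for non-finite values.

```python
import numpy as np

def find_bad_npy(dataset_txt):
    """Return the paths of calibration .npy files that contain NaN or Inf."""
    bad = []
    with open(dataset_txt) as f:
        for line in f:
            path = line.strip()
            if not path:
                continue
            arr = np.load(path)
            # np.isfinite is False for both NaN and +/-Inf
            if not np.isfinite(arr).all():
                bad.append(path)
    return bad
```

If this reports any files, fixing or removing them may let the kl_divergence algorithm run without the NaN error.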

If:

quantized_algorithm="kl_divergence"

is changed to:

quantized_algorithm="normal"

then quantization succeeds, but with these warnings:

W tensor @Add_Add16_113:out0 seems to be always 0, user might try to remove the correlative layer manually
W tensor @Conv_Conv16_115:out0 seems to be always 0, user might try to remove the correlative layer manually
W tensor @Add_Add15_123:out0 seems to be always 0, user might try to remove the correlative layer manually
W tensor @Conv_Conv15_125:out0 seems to be always 0, user might try to remove the correlative layer manually
W tensor @Add_Add14_129:out0 seems to be always 0, user might try to remove the correlative layer manually
W tensor @Conv_Conv14_131:out0 seems to be always 0, user might try to remove the correlative layer manually
W tensor @Conv_Conv16_115:weight seems to be always 0, user might try to remove the correlative layer manually
W tensor @Conv_Conv15_125:weight seems to be always 0, user might try to remove the correlative layer manually
W tensor @Conv_Conv14_131:weight seems to be always 0, user might try to remove the correlative layer manually

However, the quantized model's inference results are all 0, which does not match expectations.

What could be causing this? Is it possible that some layers of the original model already have very small values, so that after quantization they become too small and collapse straight to 0?
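That hypothesis is plausible: with asymmetric affine u8 quantization, the whole calibrated range [min, max] is mapped onto 256 levels, so any tensor whose values are much smaller than one quantization step rounds to the zero point, i.e. dequantizes to exactly 0. A minimal sketch of the arithmetic (quantize_u8 is an illustrative function, not a toolkit API):

```python
import numpy as np

def quantize_u8(x, x_min, x_max):
    """Asymmetric affine u8: x ~= scale * (q - zero_point), q in [0, 255]."""
    scale = (x_max - x_min) / 255.0
    zero_point = np.round(-x_min / scale)
    q = np.clip(np.round(x / scale + zero_point), 0, 255)
    return (q - zero_point) * scale  # dequantized value

# Values far smaller than one quantization step (~0.0078 for range [-1, 1])
# all land on the zero point and dequantize to exactly 0:
x = np.array([1e-4, -5e-5, 2e-4])
print(quantize_u8(x, x_min=-1.0, x_max=1.0))
```

So if a few outliers (or numerical issues) inflate the calibrated range of those Add/Conv tensors, the genuinely small activations and weights would quantize to 0, matching both the "seems to be always 0" warnings and the all-zero inference results.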

LS1030 avatar May 22 '22 14:05 LS1030

Did you ever solve this? Any advice appreciated.

quanh1990 avatar Jun 18 '22 14:06 quanh1990