liam_sun
Hi @keidav01, I have not yet run tests for this issue. I think you can just close it, and I will report back ASAP. Thanks for your help. Liam
Another odd thing: after converting to tmfile, the relu operations are nowhere to be seen.
I built the convert tool from source. The ONNX model that had not been through onnxsim can now be converted directly, but the same error still occurs at runtime.
The inference failure looks like it was caused by setting the precision to TENGINE_MODE_FP16, while the weights in the converted tmfile model appear to be float32. After switching to TENGINE_MODE_FP32, inference works normally. One more question: is there a way to convert the weights to fp16 at tmfile conversion time? Both MNN and NCNN have similar functionality. A sketch of what I mean follows.
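To be concrete, what I am asking for is the equivalent of the following cast applied to the stored weights at conversion time (plain numpy to illustrate the storage change only; this is not the Tengine convert-tool API, and the weight shape is made up):

```python
import numpy as np

# Hypothetical fp32 conv weight as stored in the tmfile today.
w_fp32 = np.random.randn(64, 3, 3, 3).astype(np.float32)

# Casting to fp16 halves the storage; values round to the nearest fp16.
w_fp16 = w_fp32.astype(np.float16)
print(w_fp32.nbytes, '->', w_fp16.nbytes)  # 6912 -> 3456
```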
I'm also confused about that. It's supposed to be 1/128=0.0078125.
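For context, this is the arithmetic I have in mind (assuming a symmetric int8 mapping of an input that spans [-1, 1]; the range is my assumption, not read from the model):

```python
# Assumed dynamic range of the input tensor.
max_abs = 1.0
# A symmetric convention that divides by 128 (the magnitude of the most
# negative int8 value) gives the scale I expect.
scale = max_abs / 128.0
print(scale)  # 0.0078125
```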
My quantization strategy:

```python
quantizer = PostQuantizer(
    model, dummy_input, work_dir='out',
    config={'force_overwrite': True, 'rewrite_graph': True,
            'is_input_quantized': None, 'asymmetric': False,
            'per_tensor': False})
# ...
converter = TFLiteConverter(
    ptq_model, dummy_input, tflite_path='out/ptq_model.tflite',
    strict_symmetric_check=False, quantize_target_type='int8')
```
The following strategy works:

```python
quantizer = PostQuantizer(
    model, dummy_input, work_dir='out',
    config={'force_overwrite': True, 'rewrite_graph': True,
            'is_input_quantized': None, 'asymmetric': True,
            'per_tensor': True})
# ...
converter = TFLiteConverter(
    ptq_model, dummy_input, tflite_path='out/ptq_model.tflite',
    strict_symmetric_check=False, quantize_target_type='uint8')
```
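To double-check which scheme actually landed in the file, one can dump the quantization parameters with the standard TFLite interpreter (the path below matches my converter call):

```python
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='out/ptq_model.tflite')
interpreter.allocate_tensors()
for detail in interpreter.get_tensor_details():
    qp = detail['quantization_parameters']
    if len(qp['scales']) > 0:  # only tensors that carry quantization info
        print(detail['name'], detail['dtype'].__name__,
              'per-channel' if len(qp['scales']) > 1 else 'per-tensor',
              'zero_points:', qp['zero_points'][:4])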
It looks like int8 symmetric per-channel quantization may incur errors.
Big thanks. I'll wait for your fix.
Actually, I did train an asymmetric per-channel QAT network based on the same source model. But the resulting tflite model has exactly the same weights/biases as the symmetric per-channel one.
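If it helps reproduce, this is roughly how I compared the two models (same TFLite interpreter API as above; the file names are mine and hypothetical):

```python
import numpy as np
import tensorflow as tf

def dump_int_tensors(path):
    """Collect int8/int32 tensors (weights and biases) from a tflite file."""
    interp = tf.lite.Interpreter(model_path=path)
    interp.allocate_tensors()
    out = {}
    for d in interp.get_tensor_details():
        if d['dtype'] not in (np.int8, np.int32):
            continue
        try:
            out[d['name']] = interp.get_tensor(d['index'])
        except ValueError:
            pass  # tensor has no materialized data (e.g. an intermediate)
    return out

# Hypothetical file names for the QAT and PTQ exports.
qat = dump_int_tensors('out/qat_model.tflite')
ptq = dump_int_tensors('out/ptq_model.tflite')
for name in sorted(qat.keys() & ptq.keys()):
    same = np.array_equal(qat[name], ptq[name])
    print(('same   ' if same else 'differs'), name)
```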