Error in offline int8 quantization of yolov8n model from ultralytics
Platform (include target platform as well if cross-compiling):
aarch64, Ubuntu 20.04
GitHub version:
commit a980dba3963efb0ad76b0f3caaf5c21556f69ffe (HEAD -> master, origin/master, origin/HEAD)
Merge: 226f1bc1 1924cc17
Author: jxt1234 [email protected]
Date:   Sat Jun 15 16:22:48 2024 +0800
Compiling Method
cmake -DMNN_USE_OPENCV=ON -DMNN_IMGCODECS=ON -DMNN_BUILD_TOOL=ON -DMNN_BUILD_BENCHMARK=ON -DMNN_BUILD_CONVERTER=ON -DMNN_BUILD_QUANTOOLS=ON ..
Issue
I first used ultralytics to export yolov8n.pt to the ONNX model yolov8n.onnx:
from ultralytics import YOLO
model = YOLO("yolov8n.pt")
model.export(format="onnx")
Then I converted it to an MNN model as follows:
MNNConvert -f ONNX --modelFile ./yolov8n.onnx --MNNModel yolov8n.mnn --bizCode biz --keepInputFormat
The resulting yolov8n.mnn works well with mnn-yolo.
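For reference, the sanity check on the fp32 model looked roughly like this (a minimal sketch using MNN's Python session API; the helper name run_yolov8_mnn and the random input are mine, and the real test used mnn-yolo's full pre/post-processing):

import MNN
import numpy as np

def run_yolov8_mnn(model_path, image):
    # image: NCHW float32 in [0, 1]; the model was converted with
    # --keepInputFormat, so it keeps the ONNX NCHW layout
    interpreter = MNN.Interpreter(model_path)
    session = interpreter.createSession()
    input_tensor = interpreter.getSessionInput(session)
    tmp = MNN.Tensor((1, 3, 640, 640), MNN.Halide_Type_Float, image,
                     MNN.Tensor_DimensionType_Caffe)
    input_tensor.copyFrom(tmp)
    interpreter.runSession(session)
    out = interpreter.getSessionOutput(session)
    # copy to a host tensor before reading the data back
    host = MNN.Tensor(out.getShape(), MNN.Halide_Type_Float,
                      np.zeros(out.getShape(), dtype=np.float32),
                      MNN.Tensor_DimensionType_Caffe)
    out.copyToHostTensor(host)
    return np.array(host.getData()).reshape(out.getShape())

image = np.random.rand(1, 3, 640, 640).astype(np.float32)
print(run_yolov8_mnn("./checkpoints/yolov8n.mnn", image).shape)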
However, an error occurs when I quantize yolov8n.mnn with the following command:
../MNN/build/quantized.out ./checkpoints/yolov8n.mnn ./checkpoints/yolov8n_quant.mnn ./data/yolov8n_quant.json
with yolov8n_quant.json:
{
    "format": "RGB",
    "mean": [0.0, 0.0, 0.0],
    "normal": [0.003921, 0.003921, 0.003921],
    "width": 640,
    "height": 640,
    "path": "/home/nvidia/Documents/mnn-yolo/data/coco",
    "used_image_num": 32,
    "feature_quantize_method": "KL",
    "weight_quantize_method": "MAX_ABS",
    "model": "../checkpoints/yolov8n.mnn",
    "debug": false
}
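(For context on the mean/normal values: as I understand it, MNN's calibration preprocessing computes dst = (src - mean) * normal per channel, so mean 0.0 with normal 0.003921 ≈ 1/255 scales pixels into [0, 1], matching ultralytics' YOLOv8 preprocessing.)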
The output:
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1278: >>> modelFile: ./checkpoints/yolov8n.mnn
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1279: >>> preTreatConfig: ./data/yolov8n_quant.json
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1280: >>> dstFile: ./checkpoints/yolov8n_quant_1.mnn
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1308: Calibrate the feature and quantize model...
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:159: Use feature quantization method: KL
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:160: Use weight quantization method: MAX_ABS
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:180: feature_clamp_value: 127
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:181: weight_clamp_value: 127
The device support i8sdot:1, support fp16:1, support i8mm: 0
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/Helper.cpp:111: used image num: 32
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:666: fake quant weights done.
ComputeFeatureRange: 100.00 %
CollectFeatureDistribution: 100.00 %
Can't find extraTensorDescribe for 427
[10:30:58] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1316: Quantize model done!
It seems the model is quantized successfully, but the inference results of yolov8n_quant.mnn are completely wrong. I then set "debug" to true in the JSON file, and the output becomes:
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1278: >>> modelFile: ./checkpoints/yolov8n.mnn
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1279: >>> preTreatConfig: ./data/yolov8n_quant.json
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1280: >>> dstFile: ./checkpoints/yolov8n_quant_1.mnn
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1308: Calibrate the feature and quantize model...
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:159: Use feature quantization method: KL
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:160: Use weight quantization method: MAX_ABS
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:180: feature_clamp_value: 127
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:181: weight_clamp_value: 127
The device support i8sdot:1, support fp16:1, support i8mm: 0
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/Helper.cpp:111: used image num: 32
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:666: fake quant weights done.
ComputeFeatureRange: 100.00 %
CollectFeatureDistribution: 100.00 %
[10:40:12] /home/nvidia/Documents/MNN/tools/quantization/TensorStatistic.cpp:331: Check failed: count == fakeQuantedFeature.size() (1638400 vs. 0) feature size error
Segmentation fault (core dumped)
This time the quantization process crashes outright.
Please give me some suggestions, thank you!
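For what it's worth, the "completely wrong" results can be quantified by running both models on the same input and comparing the raw outputs, reusing the hypothetical run_yolov8_mnn helper sketched above:

import numpy as np

image = np.random.rand(1, 3, 640, 640).astype(np.float32)
fp32 = run_yolov8_mnn("./checkpoints/yolov8n.mnn", image)
int8 = run_yolov8_mnn("./checkpoints/yolov8n_quant.mnn", image)
# a large deviation here confirms the quantized model is broken
print("max abs diff:", np.abs(fp32 - int8).max())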
The output above already flags the "Can't find extraTensorDescribe for 427" error. After resolving…
Has this problem been solved?
Marking as stale. No activity in 60 days.