tensorflow-yolov4-tflite
Yolov4 quantization and latency
After TF Lite quantization, the Yolov4 tiny model is indeed smaller, but latency increases: by up to 2-3 times with dynamic-range quantization and by up to 4-5 times with int8. I tested on desktop Linux (x86-64) and on a Raspberry Pi 3 (armv7), and the result is the same on both. Is the problem that the TF Lite optimizer doesn't support the Yolov4 tiny layers?
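For anyone trying to reproduce these numbers, here is a minimal timing sketch, not the harness used for the figures above. It assumes each model variant can be wrapped in a zero-argument callable (for TF Lite that would typically wrap `interpreter.invoke()` after the input tensor is set). The names `median_latency_ms`, `slowdown`, `fake_float32_invoke` and `fake_int8_invoke` are made up for illustration, and the two `fake_*` functions only sleep so the script runs without TensorFlow installed.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(run: Callable[[], object], warmup: int = 3, repeats: int = 20) -> float:
    """Return the median wall-clock time of one call to `run`, in milliseconds."""
    for _ in range(warmup):
        run()  # warm-up calls are executed but not timed
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def slowdown(baseline_ms: float, candidate_ms: float) -> float:
    """Ratio of candidate latency to baseline latency (values above 1 mean slower)."""
    return candidate_ms / baseline_ms


if __name__ == "__main__":
    # Hypothetical stand-ins for the float32 and int8 models; replace each
    # with a function that runs one inference on the real interpreter.
    def fake_float32_invoke() -> None:
        time.sleep(0.002)

    def fake_int8_invoke() -> None:
        time.sleep(0.005)

    base = median_latency_ms(fake_float32_invoke)
    quant = median_latency_ms(fake_int8_invoke)
    print(f"float32: {base:.2f} ms, int8: {quant:.2f} ms, slowdown: {slowdown(base, quant):.2f}x")
```

Taking the median over repeated runs after a few warm-up calls keeps one-off costs such as first-call allocation out of the comparison, so the ratio reflects steady-state inference time.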
Does anyone have any insight on this? I'm wondering which data types to quantize to for acceleration on ARM devices like the RPi...