Quantized models are slower than the original models

Open leizhu1989 opened this issue 2 years ago • 1 comments

Hello, when I run inference with the quantized models in a C++ project, I find they are slower than the original float32 models. Why is that?

Dec 07 '23 01:12 leizhu1989

A C++ project? Did you rewrite the "def callback(indata, outdata, frames, buftime, status)" function in C++?

Oct 31 '24 14:10 neuxys

When run with the tflite runtime, the quantised models are much faster on the intended Pi hardware.

Jul 11 '25 06:07 StuartIanNaylor
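
One way to check a slowdown like the one reported above is to time both models under identical conditions on the same device. The following is a minimal sketch, not code from this repository: `median_latency_ms` is a hypothetical helper that times any callable, and the `warmup` and `runs` defaults are arbitrary assumptions.

```python
import statistics
import time


def median_latency_ms(fn, warmup=5, runs=50):
    """Return the median wall-clock time of fn() in milliseconds.

    Hypothetical helper for illustration; not part of this repository.
    """
    # Warm-up calls are not timed, so one-off setup costs do not skew the result.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)

    # The median is less sensitive to occasional scheduling spikes than the mean.
    return statistics.median(samples)
```

With the TFLite Python API, `fn` could be the `invoke` method of an interpreter whose tensors are already allocated and whose input is already set, measured once for the float32 model and once for the quantized model. Comparing the two medians on the target hardware shows whether the slowdown is specific to one platform or runtime.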