candle
Extremely slow inference speed on CPU when trying the blip example
I have followed the tutorial and set up my first Rust example. On GPU the inference speed is slightly faster than torch (780 ms per image for candle vs 800 ms for torch on my machine). However, the example is extremely slow when I change the device to CPU: it takes 57 s to get the image features and 42 s to decode them, while the torch model only needs 2 s to finish the whole inference.
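For reference, this is roughly how I switch devices (a minimal sketch with a dummy matmul for timing, not the actual blip example code):

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    // Pick the first CUDA GPU if one is available, otherwise fall back to CPU.
    let device = Device::cuda_if_available(0)?;

    // Dummy smoke test: time a small matmul on the chosen device.
    let a = Tensor::randn(0f32, 1f32, (512, 512), &device)?;
    let start = std::time::Instant::now();
    let b = a.matmul(&a)?;
    device.synchronize()?; // CUDA kernels run async; wait before reading the clock
    println!("matmul -> {:?} on {:?} took {:?}", b.shape(), device, start.elapsed());
    Ok(())
}
```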
Did you make sure to do cargo run --release?
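In case it helps, a small guard I sometimes add so a debug build can't silently skew timings (cargo sets debug_assertions for non-release builds):

```rust
fn main() {
    // cargo enables debug_assertions for plain `cargo run`/`cargo build`
    // and disables it for --release, so this flags unoptimized binaries.
    #[cfg(debug_assertions)]
    eprintln!("warning: debug build; rerun with `cargo run --release` for realistic timings");

    // ... model loading and inference would go here ...
}
```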
Thanks, I was running in debug mode. The release build is much faster and takes 4 s to finish the inference. I will try to build with MKL to see if it gets faster still.
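For anyone following along, a sketch of how I plan to confirm the MKL build took effect; this assumes the example crate exposes an mkl cargo feature (candle-examples does), enabled with something like cargo run --release --features mkl:

```rust
fn main() {
    // cfg! is evaluated at compile time from the enabled cargo features;
    // assumes this crate declares an `mkl` feature in its Cargo.toml.
    if cfg!(feature = "mkl") {
        println!("built with the mkl feature enabled");
    } else {
        println!("built without mkl; CPU ops use the default backend");
    }
}
```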
Any updates?
+1