mxnet-ssd
Slower than Caffe at test time
Hi,
I ran both this version and the original Caffe version on the same image (data/demo/dog.jpg) with a GTX 1070 GPU, CUDA 8.0, and cuDNN 5.1.
Caffe: 14 ms (71 fps) MXNet: 33 ms (33 fps)
MXNet is about 2x slower. Is that normal?
Thanks a lot
Don't measure with a single image. The time spent loading the library differs between the two frameworks.
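To illustrate the point above, here is a minimal sketch of a benchmark that discards warm-up calls and averages over many runs. `FakeDetector` and its sleep durations are hypothetical stand-ins, not part of mxnet-ssd or Caffe; in practice you would pass your own detection call to `benchmark`.

```python
import time


def benchmark(fn, warmup=5, runs=50):
    """Return mean seconds per call of fn(), excluding warm-up calls."""
    for _ in range(warmup):
        fn()  # absorb one-time costs (library loading, memory allocation)
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


class FakeDetector:
    """Hypothetical detector: the first call pays a one-time setup cost."""

    def __init__(self):
        self.ready = False

    def detect(self):
        if not self.ready:
            time.sleep(0.05)  # simulated one-time initialization
            self.ready = True
        time.sleep(0.001)  # simulated per-image work


# Timing a single call includes the one-time setup cost.
cold = FakeDetector()
t0 = time.perf_counter()
cold.detect()
first_call = time.perf_counter() - t0

# Timing after warm-up reflects the steady-state cost per image.
steady = benchmark(FakeDetector().detect, warmup=5, runs=50)

print("first call: %.4f s, steady state: %.4f s" % (first_call, steady))
```

The single-call number is dominated by setup, so comparing two frameworks on one image mostly compares their start-up cost rather than their per-image speed.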
I want to build a real-time video object detection system, so inputs are fed one at a time, continuously. I originally benchmarked with the "caffe time" command, but it reports much faster times than what I actually measured in Python. Here is my new, fairer test: Caffe: 28 ms MXNet: 33 ms
Thanks
Are you using detector.detect to process images one by one? I admit there's more we could optimize, such as the way we load and resize the inputs.
I simply call im_detect() in a for loop. I found that it creates a testDB, which seems unnecessary for my app, so I am going to modify it myself.
@cory8249 I found that if you time only self.mod.predict(det_iter) rather than self.mod.predict(det_iter).asnumpy() in detector.detect, the measured speed improves a lot. For more solutions, see https://github.com/zhreshold/mxnet-ssd/issues/4
@niufuren When I modify the code as you suggested, I get the error below. Can you suggest a solution?
Detection time for 1 images: 0.0639 sec
Traceback (most recent call last):
File "demo_cell.py", line 133, in
ValueError: The truth value of an NDArray is ambiguous. Please convert to number with asscalar() first.
@niufuren @315386775 The NDArray API performs some computation asynchronously (see https://github.com/apache/incubator-mxnet/issues/6974), so if you want to time NDArray computation, you need to make sure the computation has finished. You can call either asnumpy() or wait_to_read().
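The effect can be sketched without a GPU. The example below is a standard-library analogy, not MXNet code: a thread pool stands in for the asynchronous engine, and `slow_forward` is a hypothetical placeholder for the forward pass. Stopping the clock right after the call returns measures only the time to enqueue the work; waiting on the result first measures the actual computation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(launch, sync=None):
    """Time launch(); if sync is given, call sync(result) before stopping the clock."""
    start = time.perf_counter()
    result = launch()
    if sync is not None:
        sync(result)  # block until the asynchronous work has finished
    return time.perf_counter() - start, result


def slow_forward():
    """Hypothetical stand-in for a forward pass that runs asynchronously."""
    time.sleep(0.05)
    return "detections"


with ThreadPoolExecutor(max_workers=1) as pool:
    # No sync: only the cost of submitting the work is measured.
    enqueue_only, pending = timed(lambda: pool.submit(slow_forward))
    pending.result()  # drain the queue so the next measurement starts clean

    # With sync: the clock stops only after the result is available.
    with_sync, _ = timed(
        lambda: pool.submit(slow_forward),
        sync=lambda future: future.result(),
    )

print("enqueue only: %.4f s, with sync: %.4f s" % (enqueue_only, with_sync))
```

With MXNet, the same helper would presumably be used as `timed(lambda: mod.predict(det_iter), sync=lambda out: out.wait_to_read())`, assuming `predict` returns a single NDArray as in the snippet quoted earlier in the thread. This is why timing predict() without asnumpy() looks much faster: the measurement ends before the computation does.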