mxnet-ssd

Slower than Caffe at test time

Open cory8249 opened this issue 8 years ago • 7 comments

Hi,

I use both this version and the original Caffe version, and tested the same image (data/demo/dog.jpg) on a GTX 1070 GPU with CUDA 8.0 and cuDNN 5.1.

Caffe: 14 ms (71 fps)
MXNet: 33 ms (about 30 fps)

MXNet is about 2x slower. Is this normal?

Thanks a lot

cory8249 avatar Jan 02 '17 17:01 cory8249

Don't measure with a single image. The time spent loading the library differs between the two frameworks.

zhreshold avatar Jan 03 '17 05:01 zhreshold
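
To illustrate the advice above, here is a minimal sketch of a benchmark that excludes one-time startup costs by running a few untimed warm-up calls first and then averaging over many inputs. The helper name `benchmark` and its `warmup` parameter are hypothetical and not part of mxnet-ssd or Caffe; `fn` stands for whatever per-image detection call is being measured. It uses `time.perf_counter`, so it assumes Python 3.

```python
import time


def benchmark(fn, inputs, warmup=5):
    """Return the mean seconds per call of fn over inputs.

    Hypothetical helper: the first `warmup` inputs are run once without
    timing so that one-time costs (library loading, memory allocation)
    do not count toward the average.
    """
    inputs = list(inputs)
    if not inputs:
        raise ValueError("need at least one input")

    # Untimed warm-up calls absorb the startup overhead.
    for x in inputs[:warmup]:
        fn(x)

    # Timed loop over every input, including the ones used for warm-up.
    start = time.perf_counter()
    for x in inputs:
        fn(x)
    elapsed = time.perf_counter() - start

    return elapsed / len(inputs)
```

If `fn` only enqueues work on an asynchronous backend, it should also block until the result is ready before returning; otherwise the loop measures enqueue time rather than compute time.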

I want to build a real-time video object detection system, so inputs are fed in one by one, continuously. I originally benchmarked with the "caffe time" command, but I found that it reports much faster times than what I actually measure in Python. Here is my new, fairer test:

Caffe: 28 ms
MXNet: 33 ms

Thanks

cory8249 avatar Jan 04 '17 15:01 cory8249

Are you using detector.detect to process images one by one? I admit there's more we could optimize, such as the way we load and resize the inputs.

zhreshold avatar Jan 05 '17 06:01 zhreshold

I simply call im_detect() in a for loop. I found that it creates a testDB (which seems unnecessary for my app), so I am going to modify it myself.

cory8249 avatar Jan 05 '17 09:01 cory8249

@cory8249 I found that if you time only self.mod.predict(det_iter) in detector.detect, rather than self.mod.predict(det_iter).asnumpy(), the measured speed improves considerably. For more solutions, see https://github.com/zhreshold/mxnet-ssd/issues/4

niufuren avatar Jan 17 '17 07:01 niufuren

@niufuren When I modify the code as you suggested, I get the error below. Can you suggest a solution?

Detection time for 1 images: 0.0639 sec
Traceback (most recent call last):
  File "demo_cell.py", line 133, in <module>
    class_names, args.thresh, args.show_timer)
  File "/home/mxnet/example/ssd/detect/detector.py", line 167, in detect_and_visualize
    dets = self.im_detect(im_list, root_dir, extension, show_timer=show_timer)
  File "/home/mxnet/example/ssd/detect/detector.py", line 98, in im_detect
    return self.detect(test_iter, show_timer)
  File "/home/mxnet/example/ssd/detect/detector.py", line 72, in detect
    res = det[np.where(det[:, 0] >= 0)[0]]
  File "/home/anaconda2/lib/python2.7/site-packages/mxnet-0.10.1-py2.7.egg/mxnet/ndarray.py", line 504, in __getitem__
    str(key), str(type(key))))
  File "/home/anaconda2/lib/python2.7/site-packages/numpy/core/numeric.py", line 1869, in array_str
    return array2string(a, max_line_width, precision, suppress_small, ' ', "", str)
  File "/home/anaconda2/lib/python2.7/site-packages/numpy/core/arrayprint.py", line 442, in array2string
    elif reduce(product, a.shape) == 0:
  File "/home/anaconda2/lib/python2.7/site-packages/mxnet-0.10.1-py2.7.egg/mxnet/ndarray.py", line 274, in __bool__
    raise ValueError("The truth value of an NDArray is ambiguous. "
ValueError: The truth value of an NDArray is ambiguous. Please convert to number with asscalar() first.

315386775 avatar Sep 27 '17 06:09 315386775

@niufuren @315386775 The NDArray API performs some computation asynchronously (see https://github.com/apache/incubator-mxnet/issues/6974), so if you want to time an NDArray computation, you need to make sure the computation has finished. You can call either asnumpy() or wait_to_read().

edmBernard avatar Sep 27 '17 17:09 edmBernard
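
The effect described in the last comment can be sketched without MXNet. The example below is a stand-in, not MXNet code: a `ThreadPoolExecutor` plays the role of an asynchronous engine, `slow_forward` is a hypothetical placeholder for a forward pass, and `future.result()` plays the role that `asnumpy()` or `wait_to_read()` plays for an NDArray. Stopping the timer before the result is ready measures only how long it took to enqueue the work.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_forward(x, delay=0.05):
    """Hypothetical stand-in for a forward pass that takes `delay` seconds."""
    time.sleep(delay)
    return x * 2


def compare_timings(delay=0.05):
    """Return (unsynced_seconds, synced_seconds) for the same workload."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Misleading: the timer stops as soon as the work is enqueued.
        start = time.perf_counter()
        future = pool.submit(slow_forward, 1, delay)
        unsynced = time.perf_counter() - start
        future.result()  # drain the queue before the next measurement

        # Correct: block until the result is ready, then stop the timer.
        # With MXNet, asnumpy() or wait_to_read() would be the blocking call.
        start = time.perf_counter()
        future = pool.submit(slow_forward, 1, delay)
        future.result()
        synced = time.perf_counter() - start

    return unsynced, synced


if __name__ == "__main__":
    unsynced, synced = compare_timings()
    print("without waiting: %.4f s" % unsynced)
    print("with waiting:    %.4f s" % synced)
```

This is why timing predict() without asnumpy() looks much faster: the measurement ends before the computation does.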