mxnet-ssd

The detect speed is slow!

Open yhl41001 opened this issue 7 years ago • 11 comments

I ran the demo with the mobilenet512 and resnet50_512 models, but it is very slow: it took 1.8480 seconds and 4.9594 seconds respectively. Is this normal?

yhl41001, Jan 09 '18 11:01

I tested on CPU.

yhl41001, Jan 09 '18 11:01

Is the forward speed related to training convergence? I used the mxnet PredictorHandle (https://mxnet.incubator.apache.org/doxygen/c__predict__api_8h.html) to run the test and measured the forward time. My experimental results are in the attached image (wx20180115-111637). Right now I am trying to reduce the loss. Meanwhile, I wonder whether there is a relationship between training loss and forward time. Thank you. @zhreshold

yanhn, Jan 15 '18 03:01

@yanhn CNN forward time is not affected by convergence. The time difference comes from the NMS operation. A good model returns a lot of background regions with very low scores, and these are excluded by NMS.

zhreshold, Jan 15 '18 06:01

Thank you, that helped. I checked the number of outputs of the MultiBoxDetection layer and found that the number of bounding boxes ranges from 160 (good model) to ~3500 (bad model). I set the threshold parameter to 0.1 and it ran a lot faster (~35ms -> ~20ms).

yanhn, Jan 15 '18 09:01
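
To illustrate why a score threshold can cut detection time as reported above, here is a minimal pure-Python sketch (not code from this thread, and not MXNet's implementation). Greedy NMS compares each candidate against the boxes already kept, so dropping low-score boxes first shrinks the work. The function names `iou` and `filter_then_nms` are hypothetical, and the sketch is class-agnostic.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_then_nms(boxes, scores, score_threshold=0.1, nms_threshold=0.5):
    """Drop boxes scoring at or below score_threshold, then run greedy NMS.

    Returns indices into the input lists, highest score first.
    """
    # Step 1: the cheap filter. This is where ~3500 boxes can shrink to a few hundred.
    candidates = [i for i, s in enumerate(scores) if s > score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)

    # Step 2: greedy NMS. A box survives only if it does not overlap
    # any higher-scoring kept box by more than nms_threshold.
    keep = []
    for i in candidates:
        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7, 0.05]
print(filter_then_nms(boxes, scores))  # the 0.05 box never reaches the NMS loop
```

In this toy input the second box overlaps the first heavily and is suppressed, while the last box is removed by the score filter before any IoU is computed.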

@yanhn @zhreshold Where do I change the threshold, and did you test on GPU or CPU?

yhl41001, Jan 16 '18 02:01

I tested on GPU. I changed the parameter in xx-symbol.json by adding "threshold": "0.1":

    {
      "op": "_contrib_MultiBoxDetection", 
      "name": "detection", 
      "attrs": {
        "force_suppress": "True", 
        "nms_threshold": "0.5", 
        "threshold": "0.1",
        "nms_topk": "400", 
        "variances": "(0.1, 0.1, 0.2, 0.2)"
      }, 
      "inputs": [[134, 0, 0], [165, 0, 0], [179, 0, 0]]
    }

I don't know why, but it sped up my demo.

yanhn, Jan 16 '18 02:01
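
For anyone who prefers to script the edit described above instead of changing the JSON by hand, here is a minimal sketch using only Python's standard `json` module. It assumes the symbol file has a top-level "nodes" list and that the detection node stores its parameters under "attrs" as strings, as in the snippet above; other MXNet versions may use a different key. The helper names `set_detection_threshold` and `patch_symbol_file` are hypothetical.

```python
import json


def set_detection_threshold(symbol, threshold, op_name="_contrib_MultiBoxDetection"):
    """Set "threshold" on every node whose op matches op_name.

    `symbol` is the parsed symbol JSON (a dict). Returns the number of nodes changed.
    """
    changed = 0
    for node in symbol.get("nodes", []):
        if node.get("op") == op_name:
            # Attribute values are stored as strings in the snippet above.
            node.setdefault("attrs", {})["threshold"] = str(threshold)
            changed += 1
    return changed


def patch_symbol_file(src_path, dst_path, threshold):
    """Read a symbol JSON file, set the threshold, and write the result to dst_path."""
    with open(src_path) as f:
        symbol = json.load(f)
    if set_detection_threshold(symbol, threshold) == 0:
        raise ValueError("no MultiBoxDetection node found in " + src_path)
    with open(dst_path, "w") as f:
        json.dump(symbol, f, indent=2)
```

Writing to a separate `dst_path` keeps the original symbol file intact, which makes it easy to compare timings with and without the threshold.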

@yanhn @zhreshold I have tried nms_thresh from 0.1 to 0.8, but the time is almost the same, about 2s.

yhl41001, Jan 16 '18 02:01

I have tested your method with different values from 0.1 to 0.95, but the time is still 2s.

yhl41001, Jan 16 '18 03:01

@zhreshold Are there any parameters I could set to speed up prediction?

yhl41001, Jan 16 '18 07:01

@yhl41001 You can set the NMS threshold to 0 to disable NMS; you will then get the network forward time, which is the best you can get. I would typically suggest using an MKL build of mxnet if you are on an Intel CPU.

zhreshold, Jan 16 '18 20:01
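
To compare runs with and without NMS as suggested above, a small timing helper can reduce noise from first-call overhead. This is a minimal sketch, not part of the demo script; `time_call` is a hypothetical name, and it assumes the callable blocks until its result is ready. MXNet executes asynchronously, so the callable should force completion, for example by converting the output with `asnumpy()`.

```python
import time


def time_call(fn, warmup=2, runs=10):
    """Return the median wall-clock time in seconds of calling fn().

    Warm-up calls are made first and not measured, since the first calls
    often include one-time setup costs.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]  # median (upper middle for even counts)


# Stand-in workload; replace with a function that runs one detection
# and blocks until the output has been computed.
elapsed = time_call(lambda: sum(range(100000)))
print("median seconds per call: %.6f" % elapsed)
```

Running the helper once with NMS disabled and once with it enabled gives the split between network forward time and NMS time that the comment above describes.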

@zhreshold I have set --nms 0, but it still takes 2s, whereas the mobilenet+ssd model provided by TensorFlow takes only about 50ms with SSE. How much faster is MKL than OpenBLAS?

yhl41001, Jan 17 '18 02:01