
processing time

Open xyoungli opened this issue 7 years ago • 12 comments

Thank you very much for your work. I have trained on my own data and deployed the model (300x300) with Visual Studio 2013 on a single GTX 1070 GPU, but the detection time was 35 ms per image, the same as the VGG-16 model. What could be the problem?

xyoungli avatar Jul 31 '17 16:07 xyoungli

The depth-wise convolution is implemented via the 'group' parameter, and that's the bottleneck. I have implemented it myself, and the performance is better than Tiny-YOLO. I will open the source later.

chuanqi305 avatar Aug 01 '17 14:08 chuanqi305
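
For context: stock Caffe has no dedicated depth-wise layer, so MobileNet's depth-wise convolutions are written as grouped convolutions with `group` equal to the channel count. A minimal prototxt sketch (layer/blob names and channel counts are illustrative, not taken from the repo):

```protobuf
# Depth-wise convolution expressed with the stock Convolution layer.
layer {
  name: "conv_dw"
  type: "Convolution"
  bottom: "in"
  top: "out"
  convolution_param {
    num_output: 32   # one filter per input channel
    group: 32        # group == num_output == input channels => depth-wise
    kernel_size: 3
    pad: 1
    stride: 1
  }
}
```

Both of Caffe's engines at the time processed the groups sequentially (one small GEMM or cuDNN call per group), so a 32-group layer paid 32 tiny kernel launches. That per-group loop is the bottleneck chuanqi305 describes.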

Thank you very much; I am looking forward to testing the new source.

xyoungli avatar Aug 02 '17 15:08 xyoungli

I am also looking forward to your implementation of depth-wise convolution. When do you plan to release the source? @chuanqi305

firefox1031 avatar Aug 10 '17 08:08 firefox1031

@chuanqi305
I replaced the convolution layers with depth-wise convolution layers and still have the same problem:
the MobileNet-SSD model (512x512) costs 38 ms (no cuDNN) and the VGG-16 model costs 32 ms (cuDNN) per image on a GTX 1080. What is your performance?

birdwcp avatar Aug 18 '17 02:08 birdwcp

@birdwcp Did you implement the depth-wise convolution layer yourself, or did you find it somewhere else? I am also trying to figure out the GPU time problem. Thanks!

abrams90 avatar Aug 21 '17 03:08 abrams90

@abrams90 I use https://github.com/farmingyard/caffe-mobilenet. The MobileNet-SSD model (300x300) costs only 7 ms (no cuDNN) per image on a GTX 1080.

birdwcp avatar Aug 21 '17 06:08 birdwcp
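
The fork linked above registers a dedicated depth-wise layer that computes all channels in a single CUDA kernel, so the usual migration is a one-word change to each dw layer's type. A hedged before/after sketch (the type string "DepthwiseConvolution" is what that fork registers; verify it against the fork you actually build):

```protobuf
layer {
  name: "conv1/dw"
  type: "DepthwiseConvolution"   # was: type: "Convolution"
  bottom: "conv1"
  top: "conv1/dw"
  convolution_param {
    num_output: 32
    group: 32        # kept as-is; the layer reuses ConvolutionParameter
    kernel_size: 3
    pad: 1
    stride: 1
  }
}
```

Because the layer name and parameter shapes are unchanged, an already-trained caffemodel should load without any weight conversion.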

@birdwcp Thank you very much, it's very helpful!

abrams90 avatar Aug 21 '17 08:08 abrams90

@chuanqi305 Looking forward to your GPU forward-pass timings. What really bothers me is that even with the depth-wise conv layers replaced, my GPU time on a 1070 is only about 2x faster than my CPU time on an i5 and still costs nearly 100 ms. That is even slower than VGG-16. Thanks!

abrams90 avatar Aug 23 '17 03:08 abrams90

You can solve the processing-time problem with https://github.com/yonghenglh6/DepthwiseConvolution

xizi avatar Sep 11 '17 11:09 xizi

How did you implement this? Can you provide the source code? @birdwcp

chl916185 avatar Oct 17 '17 08:10 chl916185

@chuanqi305 Why do the depth-wise convolution layers use engine: CAFFE instead of engine: CUDNN? In layers such as conv1/dw and conv2/dw in MobileNet-SSD/train.prototxt, setting engine: CAFFE means those layers can't use cuDNN for speedup. So why not use engine: CUDNN?

ZhiqiJiang avatar Sep 03 '18 02:09 ZhiqiJiang
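
For reference, engine: CAFFE does not push these layers to the CPU; it selects Caffe's own CUDA kernels and merely bypasses cuDNN. cuDNN versions of that era emulated grouped convolution by looping over the groups with separate descriptors, which for group == num_output was slower and more memory-hungry than Caffe's plain GPU path. A sketch in the style of the dw layers in train.prototxt (values illustrative; fillers and other fields omitted):

```protobuf
convolution_param {
  num_output: 64
  group: 64
  kernel_size: 3
  pad: 1
  engine: CAFFE   # Caffe's own GPU kernels; skips cuDNN only,
                  # does not force CPU execution
}
```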

@chuanqi305 @birdwcp How did you modify the MobileNet-SSD network from 300x300 to 512x512? Could you please post the 512x512 train.prototxt file?

adithya-p avatar Jun 24 '19 06:06 adithya-p
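
No 512x512 prototxt was posted in this thread. As a hedged sketch of the minimal edits (this is not the official SSD512 design, which also adds an extra feature map): enlarge the network input, resize training samples to match, and rescale each PriorBox layer's anchor sizes by 512/300.

```protobuf
# deploy.prototxt: enlarge the network input.
input: "data"
input_shape { dim: 1 dim: 3 dim: 512 dim: 512 }

# train.prototxt: resize samples inside transform_param, e.g.
#   resize_param { height: 512 width: 512 }

# Each PriorBox layer: scale anchors by 512/300, e.g.
#   min_size: 60.0  ->  min_size: 102.4
```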