
processing time

Open xyoungli opened this issue 7 years ago • 12 comments

Thank you very much for your work. I have trained on my own data and deployed the model (300x300) with Visual Studio 2013 on a single GTX 1070 GPU, but the detection time was 35 ms per image, the same as the VGG-16 model. What could be the problem?

xyoungli avatar Jul 31 '17 16:07 xyoungli

The depth-wise convolution is implemented via the 'group' parameter, and that's the bottleneck. I have implemented it myself, and the performance is better than Tiny-YOLO. I will open the source later.

chuanqi305 avatar Aug 01 '17 14:08 chuanqi305
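
For context: stock Caffe has no dedicated depth-wise layer, so MobileNet's depth-wise convolutions are written as grouped convolutions with `group` equal to the channel count. A minimal prototxt sketch (layer/blob names and channel counts are illustrative, not taken from the repo):

```protobuf
# Depth-wise convolution expressed with the stock Convolution layer.
layer {
  name: "conv_dw"
  type: "Convolution"
  bottom: "in"
  top: "out"
  convolution_param {
    num_output: 32   # one filter per input channel
    group: 32        # group == num_output == input channels => depth-wise
    kernel_size: 3
    pad: 1
    stride: 1
  }
}
```

Both of Caffe's engines at the time processed the groups sequentially (one small GEMM or cuDNN call per group), so a 32-group layer paid 32 tiny kernel launches. That per-group loop is the bottleneck chuanqi305 describes.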

Thank you very much; I am looking forward to testing the new source.

xyoungli avatar Aug 02 '17 15:08 xyoungli

I am also looking forward to your implementation of depth-wise convolution. When do you plan to release the source? @chuanqi305

firefox1031 avatar Aug 10 '17 08:08 firefox1031

@chuanqi305
I replaced the convolution layers with depth-wise convolution layers and still have the same problem:
the MobileNet-SSD model (512x512) costs 38 ms (no cuDNN) and the VGG-16 model costs 32 ms (cuDNN) per image on a GTX 1080. What is your performance?

birdwcp avatar Aug 18 '17 02:08 birdwcp

@birdwcp Did you implement the depth-wise convolution layer yourself, or did you find it somewhere else? I am also trying to figure out the GPU time problem. Thanks!

abrams90 avatar Aug 21 '17 03:08 abrams90

@abrams90 I use https://github.com/farmingyard/caffe-mobilenet. The MobileNet-SSD model (300x300) costs only 7 ms (no cuDNN) per image on a GTX 1080.

birdwcp avatar Aug 21 '17 06:08 birdwcp
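
The fork linked above registers a dedicated depth-wise layer that computes all channels in a single CUDA kernel, so the usual migration is a one-word change to each dw layer's type. A hedged before/after sketch (the type string "DepthwiseConvolution" is what that fork registers; verify it against the fork you actually build):

```protobuf
layer {
  name: "conv1/dw"
  type: "DepthwiseConvolution"   # was: type: "Convolution"
  bottom: "conv1"
  top: "conv1/dw"
  convolution_param {
    num_output: 32
    group: 32        # kept as-is; the layer reuses ConvolutionParameter
    kernel_size: 3
    pad: 1
    stride: 1
  }
}
```

Because the layer name and parameter shapes are unchanged, an already-trained caffemodel should load without any weight conversion.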

@birdwcp Thank you very much, it's very helpful!

abrams90 avatar Aug 21 '17 08:08 abrams90

@chuanqi305 Looking forward to your GPU forward-pass timings. What really bothers me is that even with the depth-wise conv layers replaced, my GPU time on a 1070 is only about 2x faster than my CPU time on an i5 and still costs nearly 100 ms. That is even slower than VGG-16. Thanks!

abrams90 avatar Aug 23 '17 03:08 abrams90

You can solve the processing-time problem with https://github.com/yonghenglh6/DepthwiseConvolution

xizi avatar Sep 11 '17 11:09 xizi

How did you implement this? Can you provide the source code? @birdwcp

chl916185 avatar Oct 17 '17 08:10 chl916185

@chuanqi305 Why do the depth-wise convolution layers use engine: CAFFE instead of engine: CUDNN? In layers such as conv1/dw and conv2/dw in MobileNet-SSD/train.prototxt, setting engine: CAFFE means those layers can't use cuDNN for speedup. So why not use engine: CUDNN?

ZhiqiJiang avatar Sep 03 '18 02:09 ZhiqiJiang
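
For reference, engine: CAFFE does not push these layers to the CPU; it selects Caffe's own CUDA kernels and merely bypasses cuDNN. cuDNN versions of that era emulated grouped convolution by looping over the groups with separate descriptors, which for group == num_output was slower and more memory-hungry than Caffe's plain GPU path. A sketch in the style of the dw layers in train.prototxt (values illustrative; fillers and other fields omitted):

```protobuf
convolution_param {
  num_output: 64
  group: 64
  kernel_size: 3
  pad: 1
  engine: CAFFE   # Caffe's own GPU kernels; skips cuDNN only,
                  # does not force CPU execution
}
```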

@chuanqi305 @birdwcp How did you modify the MobileNet-SSD network from 300x300 to 512x512? Could you please post the 512x512 train.prototxt file?

adithya-p avatar Jun 24 '19 06:06 adithya-p
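
No 512x512 prototxt was posted in this thread. As a hedged sketch of the minimal edits (this is not the official SSD512 design, which also adds an extra feature map): enlarge the network input, resize training samples to match, and rescale each PriorBox layer's anchor sizes by 512/300.

```protobuf
# deploy.prototxt: enlarge the network input.
input: "data"
input_shape { dim: 1 dim: 3 dim: 512 dim: 512 }

# train.prototxt: resize samples inside transform_param, e.g.
#   resize_param { height: 512 width: 512 }

# Each PriorBox layer: scale anchors by 512/300, e.g.
#   min_size: 60.0  ->  min_size: 102.4
```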