
Inference time, after loading the weights, is slower than ./build/tools/caffe time

Open jazzseow opened this issue 5 years ago • 4 comments

When I ran ./build/tools/caffe time, I got:

I0716 10:14:52.669873 18718 caffe.cpp:656] Average Forward pass: 11.4608 ms.

When I ran ./build/examples/ssd/ssd_detect.bin and timed the forward function, I got timings like these:

time: 23.809 ms
time: 22.517 ms
time: 23.631 ms
time: 22.49 ms
time: 23.481 ms
time: 21.887 ms
time: 23.696 ms
time: 22.322 ms
time: 23.026 ms
time: 23.716 ms
time: 22.506 ms
time: 22.152 ms
time: 23.222 ms
time: 21.964 ms
time: 23.871 ms
time: 22.715 ms
time: 23.888 ms
time: 22.232 ms
time: 23.315 ms
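For an apples-to-apples comparison with the averaged figure that caffe time reports, the per-call timings can be reduced to a single average over many iterations. The following is only a minimal sketch of such a harness, not the code from this issue: `AverageForwardMs` and its `forward`, `warmup`, and `iters` parameters are hypothetical names, and the callable stands in for whatever wraps the network's forward pass.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not part of Caffe): runs `forward` `warmup` times
// untimed, then `iters` times timed, and returns the mean wall-clock
// time per timed call in milliseconds.
double AverageForwardMs(const std::function<void()>& forward,
                        int warmup, int iters) {
  if (iters <= 0) {
    return 0.0;  // nothing to average
  }
  // Untimed warm-up runs absorb one-off costs such as lazy allocation.
  for (int i = 0; i < warmup; ++i) {
    forward();
  }
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iters; ++i) {
    forward();
  }
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> elapsed = stop - start;
  return elapsed.count() / iters;
}
```

When the forward pass runs on a GPU, the callable should finish with a device synchronization before returning. Otherwise the timer may capture only the kernel launches rather than the actual compute time.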

My code is here: https://drive.google.com/drive/folders/1cAhF9wBNjBpO9Ykoh80Sv5eBqBZJwDYQ?usp=sharing

jazzseow avatar Jul 16 '19 02:07 jazzseow

@jazzseow could you upload the commands, their outputs and prototxt files used?

drnikolaev avatar Jul 18 '19 08:07 drnikolaev

@drnikolaev Thank you for your reply. I have uploaded the required files to https://drive.google.com/open?id=1cAhF9wBNjBpO9Ykoh80Sv5eBqBZJwDYQ

Also, I have uploaded the modifications required to run the RefineDet model, under the include/ and src/ folders.

jazzseow avatar Jul 19 '19 06:07 jazzseow

A-ha, this seems like a bug: when you run caffe time, the convolution algorithms get optimized like this:

I0719 11:48:43.798629 11106 cudnn_conv_layer.cpp:857] [n0.d0.r0] Conv Algos (F,BD,BF): 'conv3_1' with space 0.08G 63/1 6 1 0 	(avail 9.72G, req 0.08G)	t: 0 0 0.6
I0719 11:48:44.033376 11106 cudnn_conv_layer.cpp:857] [n0.d0.r0] Conv Algos (F,BD,BF): 'conv3_2' with space 0.09G 233/1 6 1 5 	(avail 9.7G, req 0.09G)	t: 0 0 1.11
I0719 11:48:44.282755 11106 cudnn_conv_layer.cpp:857] [n0.d0.r0] Conv Algos (F,BD,BF): 'conv3_3' with space 0.09G 233/1 6 1 5 	(avail 9.68G, req 0.09G)	t: 0 0 1.15

But caffe test doesn't do this. Could you try commenting out lines https://github.com/NVIDIA/caffe/blob/caffe-0.17/src/caffe/layers/cudnn_conv_layer.cpp#L450 and https://github.com/NVIDIA/caffe/blob/caffe-0.17/src/caffe/layers/cudnn_conv_layer.cpp#L456 and retry caffe test?

drnikolaev avatar Jul 19 '19 07:07 drnikolaev

@drnikolaev So I tried this:

if (!use_modest_workspace()) {
    // if (this->phase_ == TRAIN) {
    // Now taking the rest for running FindEx calls
    // We'll release what's possible in BW pass
    LOG(INFO); // line 453
    AllocateFindExWorkspace();
    // Also used by Test Net but based on shared space taken by Train:
    LOG(INFO); // line 456
    FindExConvAlgo(bottom, top);
    LOG(INFO); // line 458
    // }
    use_algo_seeker_ = false;
}

With this change, caffe time works fine, but when I run ssd_detect it crashes with a segmentation fault in FindExConvAlgo(bottom, top):

I0719 16:18:22.243350 23616 cudnn_conv_layer.cpp:453] 
I0719 16:18:22.248224 23616 cudnn_conv_layer.cpp:456] 
Segmentation fault (core dumped)

jazzseow avatar Jul 19 '19 08:07 jazzseow