Inference time, after loading the weights, is slower than ./build/tools/caffe time
When I ran ./build/tools/caffe time, I got:
I0716 10:14:52.669873 18718 caffe.cpp:656] Average Forward pass: 11.4608 ms.
When I ran ./build/examples/ssd/ssd_detect.bin and timed the forward function, I got timings like these:
time: 23.809 ms
time: 22.517 ms
time: 23.631 ms
time: 22.49 ms
time: 23.481 ms
time: 21.887 ms
time: 23.696 ms
time: 22.322 ms
time: 23.026 ms
time: 23.716 ms
time: 22.506 ms
time: 22.152 ms
time: 23.222 ms
time: 21.964 ms
time: 23.871 ms
time: 22.715 ms
time: 23.888 ms
time: 22.232 ms
time: 23.315 ms
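The 19 timings above average about 22.97 ms, roughly twice the 11.46 ms that caffe time reports. To make the two numbers comparable, the forward call could be timed as an average over several iterations after a few untimed warm-up calls. The following is a minimal sketch of such a harness, not the code from the linked folder: AverageForwardMs and its parameters are hypothetical names, and the callable stands in for whatever wraps the net's forward pass.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: average wall-clock time of `forward`, in milliseconds.
// `warmup` untimed calls run first so one-time setup cost is excluded.
double AverageForwardMs(const std::function<void()>& forward,
                        int warmup, int iterations) {
  assert(iterations > 0);
  for (int i = 0; i < warmup; ++i) {
    forward();
  }
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) {
    forward();
  }
  // Assumption: `forward` has fully finished when it returns. For a GPU net,
  // the device would need to be synchronized here before reading the clock
  // (for example with cudaDeviceSynchronize()), or queued work is not counted.
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> elapsed = stop - start;
  return elapsed.count() / iterations;
}
```

If the gap stays at roughly 2x with warm-up and averaging in place, the difference is unlikely to be a measurement artifact.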
My code is here: https://drive.google.com/drive/folders/1cAhF9wBNjBpO9Ykoh80Sv5eBqBZJwDYQ?usp=sharing
@jazzseow could you upload the commands, their outputs and prototxt files used?
@drnikolaev Thank you for your reply. I have uploaded the required files to https://drive.google.com/open?id=1cAhF9wBNjBpO9Ykoh80Sv5eBqBZJwDYQ
Also, I have uploaded the modifications required to run the RefineDet model, under the include/ and src/ folders.
A-ha, seems like a bug: when you run caffe time, convolution algos get optimized like this:
I0719 11:48:43.798629 11106 cudnn_conv_layer.cpp:857] [n0.d0.r0] Conv Algos (F,BD,BF): 'conv3_1' with space 0.08G 63/1 6 1 0 (avail 9.72G, req 0.08G) t: 0 0 0.6
I0719 11:48:44.033376 11106 cudnn_conv_layer.cpp:857] [n0.d0.r0] Conv Algos (F,BD,BF): 'conv3_2' with space 0.09G 233/1 6 1 5 (avail 9.7G, req 0.09G) t: 0 0 1.11
I0719 11:48:44.282755 11106 cudnn_conv_layer.cpp:857] [n0.d0.r0] Conv Algos (F,BD,BF): 'conv3_3' with space 0.09G 233/1 6 1 5 (avail 9.68G, req 0.09G) t: 0 0 1.15
But caffe test doesn't.
Could you try commenting out these lines
https://github.com/NVIDIA/caffe/blob/caffe-0.17/src/caffe/layers/cudnn_conv_layer.cpp#L450
https://github.com/NVIDIA/caffe/blob/caffe-0.17/src/caffe/layers/cudnn_conv_layer.cpp#L456
and retry caffe test?
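The suggestion above amounts to removing a phase gate around the convolution algorithm search. A toy model of that gate might look like the sketch below. It assumes the two linked lines are the `if (this->phase_ == TRAIN)` check and its closing brace, as the follow-up reply suggests. ToyConvLayer, Setup and gate_on_train_phase are hypothetical names for illustration only and are not part of Caffe.

```cpp
enum class Phase { TRAIN, TEST };

// Hypothetical stand-in for a convolution layer's setup step.
struct ToyConvLayer {
  Phase phase;
  bool algo_searched;

  explicit ToyConvLayer(Phase p) : phase(p), algo_searched(false) {}

  // With the gate on, the search runs only for a TRAIN-phase layer.
  // With the gate off (the check commented out), it runs for any phase.
  void Setup(bool gate_on_train_phase) {
    if (!gate_on_train_phase || phase == Phase::TRAIN) {
      algo_searched = true;  // stands in for FindExConvAlgo(bottom, top)
    }
  }
};
```

In this model a TEST-phase layer never has its algorithms searched while the gate is on, which would match a net loaded for inference running slower than one benchmarked in a phase where the search does run.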
@drnikolaev So I tried this
    if (!use_modest_workspace()) {
      // if (this->phase_ == TRAIN) {
        // Now taking the rest for running FindEx calls
        // We'll release what's possible in BW pass
        LOG(INFO);  // line 453
        AllocateFindExWorkspace();
        // Also used by Test Net but based on shared space taken by Train:
        LOG(INFO);  // line 456
        FindExConvAlgo(bottom, top);
        LOG(INFO);  // line 458
      // }
      use_algo_seeker_ = false;
    }
caffe time works fine, but running ssd_detect results in a segmentation fault at FindExConvAlgo(bottom, top):
I0719 16:18:22.243350 23616 cudnn_conv_layer.cpp:453]
I0719 16:18:22.248224 23616 cudnn_conv_layer.cpp:456]
Segmentation fault (core dumped)