
[Slowfast/Pytorch] More inference time in CUDA env compared to CPU (occurred only for one layer)

Open · koalaaaaaaaaa opened this issue · 0 comments

Related to Slowfast/Pytorch

Describe the bug

When I run inference on a deep learning model (SlowFast), my Python program seems to take more inference time in the CUDA environment than on the CPU. It is not the whole model: one specific layer takes more time on CUDA than on CPU. I'm confused and hope someone can help me with it. Here are the details.

The specific layer is the "slowway-conv1" layer, as shown in the picture below of the SlowFast model structure.

[image: SlowFast model structure]

My confusing results are as follows; the first is for CUDA and the second for CPU.

[image: CUDA timing result]
[image: CPU timing result]

In the CUDA environment, the processing time of "conv1" (0.97 s) accounts for a large proportion of the whole model's processing time (1.04 s), while in the CPU environment the processing time of "conv1" (0.07 s) accounts for only a very small proportion of the whole model's processing time (4.43 s). I reckon the CPU proportion is reasonable given the computational cost of the layer.

Is my method of time measurement mistaken? I used the following code to measure the time cost.

[image: time-measurement code]
[image: time-measurement code (continued)]

If it is my own mistake that is causing this confusing result, please kindly point it out, or please give me some ideas to help me solve this problem. Thank you very much!
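For context, a common pitfall when timing per-layer runs on CUDA is that kernel launches are asynchronous: `time.time()` around a layer call measures only the launch, and the wait for all previously queued kernels gets silently charged to whichever layer first forces a synchronization (often the first conv). The poster's actual code is only visible in the screenshots, so this is a minimal sketch of one way to time a layer with explicit synchronization; the `timed_forward` helper and the `Conv3d` stand-in are illustrative, not taken from the issue:

```python
import time
import torch
import torch.nn as nn

def timed_forward(module, x):
    """Wall-clock timing that is valid on both CPU and CUDA tensors."""
    # Without these synchronize() calls, perf_counter() on CUDA measures
    # only the asynchronous kernel launch, not the kernel execution.
    if x.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    out = module(x)
    if x.is_cuda:
        torch.cuda.synchronize()
    return out, time.perf_counter() - start

# Illustrative stand-in for a SlowFast pathway conv, not the real layer.
conv = nn.Conv3d(3, 8, kernel_size=3, padding=1)
x = torch.randn(1, 3, 4, 16, 16)  # (N, C, T, H, W)
if torch.cuda.is_available():
    conv, x = conv.cuda(), x.cuda()
out, elapsed = timed_forward(conv, x)
```

On CUDA, `torch.cuda.Event(enable_timing=True)` pairs with `event.record()` and `event.elapsed_time()` for finer-grained device-side timing without stalling the whole pipeline at every layer.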

Environment

  • Container version: pytorch:1.7.0-py3
  • GPUs in the system: GTX-1650-16GB
  • CUDA driver version: 457.66
  • CUDA version: 10.2 (V10.2.89)

koalaaaaaaaaa · Mar 05 '22 14:03