
Performance of concurrent inference with two different modules

LightSun opened this issue 1 year ago · 7 comments

Description

I have two different modules, both converted to TRT engines. When I run them serially, the inference-only cost times are:

// 10 iterations
do_infer >> cost 400.60 msec. // warm-up
do_infer >> cost 42.22 msec. 
do_infer >> cost 11.69 msec.
do_infer >> cost 37.12 msec.
do_infer >> cost 9.97 msec.
do_infer >> cost 33.33 msec.
do_infer >> cost 9.87 msec.
do_infer >> cost 34.53 msec.
do_infer >> cost 9.96 msec.
do_infer >> cost 34.66 msec.
do_infer >> cost 10.88 msec.
do_infer >> cost 35.41 msec.
do_infer >> cost 11.10 msec.
do_infer >> cost 33.84 msec.
do_infer >> cost 10.00 msec.
do_infer >> cost 33.42 msec.
do_infer >> cost 10.08 msec.
do_infer >> cost 34.65 msec.
do_infer >> cost 10.63 msec.
do_infer >> cost 34.66 msec.

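For reference, here is a minimal sketch of how such a serial timing loop is typically structured with TensorRT 8.4. It is an assumed, simplified version: do_infer is reconstructed from the log format, and ctxA/ctxB, bindingsA/bindingsB and the stream are placeholder names, not the actual project code.

#include <chrono>
#include <cstdio>
#include <cuda_runtime_api.h>
#include <NvInfer.h>

// Time one inference: enqueue asynchronously, then block until the stream is done.
void do_infer(nvinfer1::IExecutionContext* ctx, void** bindings, cudaStream_t stream) {
    auto t0 = std::chrono::high_resolution_clock::now();
    ctx->enqueueV2(bindings, stream, nullptr);   // async enqueue (TensorRT 8.x API)
    cudaStreamSynchronize(stream);               // wait so the measured time covers the GPU work
    auto t1 = std::chrono::high_resolution_clock::now();
    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    printf("do_infer >> cost %.2f msec.\n", ms);
}

// Serial case: both engines run back to back on one thread and one stream.
void run_serial(nvinfer1::IExecutionContext* ctxA, void** bindingsA,
                nvinfer1::IExecutionContext* ctxB, void** bindingsB,
                cudaStream_t stream) {
    for (int i = 0; i < 10; ++i) {
        do_infer(ctxA, bindingsA, stream);
        do_infer(ctxB, bindingsB, stream);
    }
}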
In parallel with two threads:

do_infer >> cost 408.90 msec.  // warm-up
do_infer >> cost 407.14 msec.  // warm-up
do_infer >> cost 11.41 msec.
do_infer >> cost 50.40 msec.
do_infer >> cost 13.05 msec.
do_infer >> cost 50.36 msec.
do_infer >> cost 44.26 msec.
do_infer >> cost 43.02 msec.
do_infer >> cost 44.29 msec.
do_infer >> cost 43.25 msec.
do_infer >> cost 50.69 msec.
do_infer >> cost 49.08 msec.
do_infer >> cost 48.10 msec.
do_infer >> cost 47.28 msec.
do_infer >> cost 50.19 msec.
do_infer >> cost 48.67 msec.
do_infer >> cost 47.18 msec.
do_infer >> cost 46.64 msec.
do_infer >> cost 12.24 msec.
do_infer >> cost 46.06 msec.

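And a sketch of the assumed parallel variant (again simplified, reusing do_infer from the sketch above): each engine gets its own thread, its own IExecutionContext and its own cudaStream_t, so the two enqueues can overlap on the GPU.

#include <thread>

// Parallel case: each module runs on its own thread with a dedicated stream.
void run_parallel(nvinfer1::IExecutionContext* ctxA, void** bindingsA, cudaStream_t streamA,
                  nvinfer1::IExecutionContext* ctxB, void** bindingsB, cudaStream_t streamB) {
    std::thread tA([&] { for (int i = 0; i < 10; ++i) do_infer(ctxA, bindingsA, streamA); });
    std::thread tB([&] { for (int i = 0; i < 10; ++i) do_infer(ctxB, bindingsB, streamB); });
    tA.join();
    tB.join();
}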
Why is the parallel performance not better than the serial run?

Environment

TensorRT Version: 8.4.3.1

NVIDIA GPU: RTX3070

NVIDIA Driver Version: 470.74

CUDA Version: 11.4

CUDNN Version: 8.5.0

Operating System: ubuntu-18.04

Python Version (if applicable):

Tensorflow Version (if applicable):

PyTorch Version (if applicable):

Baremetal or Container (if so, version):

LightSun · Sep 03 '24 07:09