
3D network inference is very slow

Open Cuzny opened this issue 3 years ago • 3 comments

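I defined a 3D network (SlowFast) myself, but inference is very slow. It takes 60+ ms in PyTorch, but about 1 s after defining the network with the C++ API, and about 350 ms in FP16 mode. I printed the timings, and conv3d seems to be the cause. Does anyone know what the reason might be?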

Env

  • GPU: Xavier NX
  • OS: Ubuntu 18.04
  • CUDA version: 10.2
  • TensorRT version: 8.2.1

Cuzny avatar Jul 19 '22 05:07 Cuzny

Did you try running inference in a loop? The GPU might need a warm-up.

wang-xinyu avatar Jul 20 '22 05:07 wang-xinyu

I ran inference 20 times before measuring the speed, so I think the cause is probably that the Xavier NX does not support 3D convolution well.

Cuzny avatar Jul 20 '22 05:07 Cuzny
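
For readers who want to reproduce the warm-up-then-measure procedure discussed above, here is a minimal sketch in C++. It is not code from this thread or from tensorrtx: `mean_latency_ms` is a hypothetical helper, and the `step` callable stands in for whatever runs one inference. For GPU work, the assumption is that `step` enqueues the inference and then waits for it to finish (for example with `cudaStreamSynchronize`) before returning, since otherwise only the launch overhead is timed.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper (not part of TensorRT or tensorrtx).
// Runs `step` `warmup` times without timing, then `iters` times with timing,
// and returns the mean wall-clock time per timed iteration in milliseconds.
// Assumption: `step` blocks until the work is done (e.g. it synchronizes the
// CUDA stream itself), so the host clock reflects the full inference time.
inline double mean_latency_ms(const std::function<void()>& step,
                              std::size_t warmup,
                              std::size_t iters) {
    // Untimed warm-up iterations: first calls may include lazy initialization
    // and clock ramp-up that would distort the average.
    for (std::size_t i = 0; i < warmup; ++i) {
        step();
    }

    if (iters == 0) {
        return 0.0;  // Nothing to measure; avoid dividing by zero.
    }

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iters; ++i) {
        step();
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iters);
}
```

A call such as `mean_latency_ms(run_once, 20, 100)` would mirror the 20 warm-up runs mentioned above, where `run_once` is a placeholder for the caller's own inference function. Averaging over many timed iterations also smooths out per-run jitter, which makes the PyTorch and TensorRT numbers easier to compare.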

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 18 '22 08:09 stale[bot]