
❓ [Question] a10 performance drop significantly

ArtemisZGL opened this issue 1 year ago

❓ Question

I converted the GFPGAN model (https://github.com/TencentARC/GFPGAN) with torch_tensorrt, and I found torch_tensorrt is twice as fast as torch on a 3070. But on one A10 server, torch_tensorrt and torch are close; on another A10 server, torch_tensorrt is even twice as slow as torch. Statistics are shown below (the two A10 variants are from two different cloud providers).

| GPU | CPU | CPU cores | CPU freq | Memory | Framework | GPU usage | CPU usage | Memory usage | GPU memory usage | Inference time |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3070 | AMD Ryzen 7 5800X 8-Core Processor | 16 | 2200-3800 MHz | 32 GB | pytorch | 30-35% | 160-170% | 13.5 GB | 987.7 MB | 33.889511 s |
| 3070 | | | | | torch_tensorrt | 15-20% | 180-200% | 11.7 GB | 1.1 GB | 16.259879 s |
| A10 (v1) | Intel(R) Xeon(R) Platinum 8350C CPU @ 2.60GHz | 28 | 2593 MHz | 112 GB | pytorch | 25-30% | 190-200% | 15.1 GB | 1.2 GB | 33.933190 s |
| A10 (v1) | | | | | torch_tensorrt | 15-20% | 190-200% | 13.0 GB | 1.2 GB | 31.899047 s |
| A10 (v2) | Intel(R) Xeon(R) Platinum 8336C CPU @ 2.30GHz | 28 | 2300-4600 MHz | 112 GB | pytorch | 20-30% | 180-200% | 15.1 GB | 1.0 GB | 34.027398 s |
| A10 (v2) | | | | | torch_tensorrt | 10-15% | 160-170% | 13.1 GB | 1.1 GB | 66.498723 s |

I also tried torch2trt (https://github.com/NVIDIA-AI-IOT/torch2trt) and, after fixing some op errors, found it to be twice as fast as torch_tensorrt on the 3070. Its performance also did not drop so strangely on the A10 servers.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • PyTorch Version (e.g., 1.0): nvcr.io/nvidia/pytorch:23.08-py3
  • CPU Architecture: as above
  • OS (e.g., Linux): linux
  • How you installed PyTorch (conda, pip, libtorch, source): docker
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration: as above
  • Any other relevant information:

Additional context

ArtemisZGL, Dec 25 '23 08:12

Hi - thank you for the report. I noticed these results are using the 23.08-py3 container, and there have been many upgrades and changes to Torch-TensorRT since the version packaged in that container. The latest version of Torch-TensorRT can be found in the hosted Docker containers and would be worth trying for comparison against the results above.

# For latest nightly version of Torch-TRT + Torch
docker pull ghcr.io/pytorch/tensorrt/torch_tensorrt:nightly
# For latest stable Torch + RC of Torch-TRT
docker pull ghcr.io/pytorch/tensorrt/torch_tensorrt:release_2.1
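When comparing results across containers, it helps to record exactly which versions ended up in each environment. A small sketch of such a check, assuming a standard Python environment inside the container; the `report_versions` helper is a hypothetical convenience, not part of either project:

```python
import importlib
import importlib.util

def report_versions(mods=("torch", "torch_tensorrt", "tensorrt")):
    """Return {module name: version string, or None if not installed}."""
    versions = {}
    for name in mods:
        if importlib.util.find_spec(name) is None:
            versions[name] = None  # module not present in this environment
        else:
            mod = importlib.import_module(name)
            versions[name] = getattr(mod, "__version__", "unknown")
    return versions

print(report_versions())
```

Logging this alongside each benchmark row makes it easier to attribute performance differences to the software stack rather than the GPU.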

gs-olive, Dec 27 '23 18:12

@gs-olive Thanks, I will give it a try as soon as possible.

ArtemisZGL, Jan 05 '24 02:01