🐛 [Bug] Torch-TensorRT doesn't support timm nfnet
Bug Description
Program crashes when running the following benchmarking script:
import torch
import torch_tensorrt
import timm
import time
import numpy as np
import torch.backends.cudnn as cudnn
torch.hub._validate_not_a_forked_repo=lambda a,b,c: True
nfnet = timm.create_model('dm_nfnet_f0',pretrained=True)
model = nfnet.eval().to("cuda")
detections_batch = model(torch.randn(128, 3, 224, 224).to("cuda"))
detections_batch.shape
cudnn.benchmark = True
def benchmark(model, input_shape=(1024, 3, 512, 512), dtype='fp32', nwarmup=50, nruns=1000):
    input_data = torch.randn(input_shape)
    input_data = input_data.to("cuda")
    if dtype == 'fp16':
        input_data = input_data.half()
    print("Warm up ...")
    with torch.no_grad():
        for _ in range(nwarmup):
            features = model(input_data)
    torch.cuda.synchronize()
    print("Start timing ...")
    timings = []
    with torch.no_grad():
        for i in range(1, nruns + 1):
            start_time = time.time()
            pred_loc = model(input_data)
            torch.cuda.synchronize()
            end_time = time.time()
            timings.append(end_time - start_time)
            if i % 10 == 0:
                print('Iteration %d/%d, avg batch time %.2f ms' % (i, nruns, np.mean(timings) * 1000))
    print("Input shape:", input_data.size())
    print('Average throughput: %.2f images/second' % (input_shape[0] / np.mean(timings)))

trt_model = torch_tensorrt.compile(model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch_tensorrt.dtype.half}  # Run with FP16
)
benchmark(trt_model, input_shape=(1, 3, 224, 224), nruns=100, dtype="fp16")
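As an aside: the first warning in the log below asks for the input type to be specified explicitly. Presumably (untested here) the dtype field of the Input spec would supply it, e.g.:

trt_model = torch_tensorrt.compile(model,
    # dtype marks the input as FP16 so Torch-TensorRT does not default the
    # shape-analysis input to Float32
    inputs=[torch_tensorrt.Input((1, 3, 224, 224), dtype=torch.half)],
    enabled_precisions={torch_tensorrt.dtype.half}
)

This only addresses the type warning, not the aten::ceil.float error itself.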
To Reproduce
Steps to reproduce the behavior:
- Run the script
- Error message:
WARNING: [Torch-TensorRT] - Cannot infer input type from calcuations in graph for input x.1. Assuming it is Float32. If not, specify input type explicity
ERROR: [Torch-TensorRT] - Unsupported operator: aten::ceil.float(float a) -> (int)
File "/data/home/xzhao9/cluster/miniconda3/envs/py38/lib/python3.8/site-packages/timm/models/layers/padding.py", line 19
def get_same_padding(x: int, k: int, s: int, d: int):
return max((math.ceil(x / s) - 1) * s + (k - 1) * d + 1 - x, 0)
~~~~~~~~~ <--- HERE
ERROR: [Torch-TensorRT] - Unsupported operator: aten::ceil.float(float a) -> (int)
File "/data/home/xzhao9/cluster/miniconda3/envs/py38/lib/python3.8/site-packages/timm/models/layers/padding.py", line 19
def get_same_padding(x: int, k: int, s: int, d: int):
return max((math.ceil(x / s) - 1) * s + (k - 1) * d + 1 - x, 0)
~~~~~~~~~ <--- HERE
WARNING: [Torch-TensorRT] - Input type for doing shape analysis could not be determined, defaulting to F32
Segmentation fault (core dumped)
Expected behavior
The script shouldn't crash; it should print the performance results.
Environment
Build information about Torch-TensorRT can be found by turning on debug messages
- Torch-TensorRT Version (e.g. 1.0.0): master (4fd886d08ce77323995b5bf6a21a0d0e8dde8d42)
- PyTorch Version (e.g. 1.0): 1.10.0+cu113
- CPU Architecture: AWS p3d.24xlarge instance
- OS (e.g., Linux): Linux
- How you installed PyTorch (conda, pip, libtorch, source): pip
- Build command you used (if compiling from source): python setup.py bdist_wheel
- Are you using local sources or building from archives: local sources from github
- Python version: 3.8
- CUDA version: 11.3
- GPU models and configuration: Nvidia V100
- Any other relevant information:
Hey, we ran into the same issue a while back; timm is not implemented with torch_tensorrt inference in mind. For this particular issue, though, you can easily make it work by recording the padding values for your images and setting them manually in a monkey patch.
Sorry, I am new to torch_tensorrt. Can you give an example of how to patch the script in the issue body?
Probably a related issue: I encountered another error when trying to run Torch-TensorRT with torchvision models: https://github.com/pytorch/vision/issues/5378. Since Torch-TensorRT only builds against the latest stable PyTorch release, I don't test it on the nightly version.
So, here the problem seems to be that the function aten::ceil.float is not supported by Torch-TensorRT, and you want to find a way to work around that.
An easy solution is to install timm in an NGC container; using pip, it will be installed in /opt/lib/python3.8/site-packages/timm.
You want to modify the function get_same_padding from /opt/lib/python3.8/site-packages/timm/models/layers/padding.py so that it does not use aten::ceil.float.
The quickest way to do that is to replace it with a dict, and then modify the functions that call get_same_padding to use the dict you have just created.
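For anyone who wants a concrete example, here is a minimal sketch of such a monkey patch, untested against this exact setup. The module path timm.models.layers.padding comes from the traceback above; the integer-only rewrite is a variant of the suggestion (instead of a recorded dict, it computes the same "same" padding with integer ceiling division, which should keep aten::ceil.float out of the graph):

import timm.models.layers.padding as timm_padding

def get_same_padding_int(x: int, k: int, s: int, d: int) -> int:
    # (x + s - 1) // s is ceiling division on positive ints, equivalent to
    # math.ceil(x / s) but lowered to integer ops instead of aten::ceil.float.
    return max(((x + s - 1) // s - 1) * s + (k - 1) * d + 1 - x, 0)

# Patch before creating/compiling the model. Caveat: any module that did
# `from .padding import get_same_padding` keeps its own reference and would
# need the same patch applied to its namespace.
timm_padding.get_same_padding = get_same_padding_int

The recorded-dict approach described above is the same idea: wrap the original function, run one eager forward pass at your fixed input size to log each (x, k, s, d) -> padding value, then patch in a lookup. The arithmetic rewrite just avoids pinning the input size.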
Thanks @MatthieuTPHR! We are developing a benchmark suite that compares different PyTorch TensorRT libraries (such as onnx2trt, torch_tensorrt, torch2trt, etc.), and timm is one of our upstream model repositories. We would prefer not to change the model code unless the patch is accepted by the upstream repo (in this case, timm).
Is there a plan for when Torch-TensorRT will support aten::ceil.float?
A related issue is https://github.com/NVIDIA/Torch-TensorRT/issues/890, where we also find correctness issues with timm and Torch-TensorRT.
This issue has not seen activity for 90 days, Remove stale label or comment or this will be closed in 10 days