torch2trt
Problem with AdaptiveMaxPool2d, likely with stride/kernel_size
I want to convert a model to TRT, but I have a problem with nn.AdaptiveMaxPool2d. It seems to work only for specific kernel sizes. In my case the pooling window is (384, 384), and you guys only test with (224, 224) inputs, as shown in adaptive_max_pool2d.py: `@add_module_test(torch.float32, torch.device("cuda"), [(1, 3, 224, 224)])`
So, as I said, my kernel_size is (384, 384), and the following line in the converter file returns None:

```python
layer = ctx.network.add_pooling(input=input._trt, type=trt.PoolingType.MAX, window_size=kernel_size)
```

Because `layer` is None, the next line, `layer.stride = stride`, fails since None has no attribute stride. Is there any way to fix this issue?
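For reference, a minimal sketch of the kind of setup that fails for me (a hypothetical toy model standing in for my real one, assuming torch2trt 0.4.0 and a CUDA device):

```python
import torch
import torch.nn as nn
from torch2trt import torch2trt

# Toy stand-in: a conv followed by a global adaptive max pool, so the
# pooling window computed by the converter is the full 384x384 map.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.AdaptiveMaxPool2d(1),
).cuda().eval()

x = torch.ones(1, 3, 384, 384).cuda()

# Conversion dies here: add_pooling returns None for the (384, 384) window,
# and the subsequent layer.stride = stride raises.
model_trt = torch2trt(model, [x])
```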
I would be down to write my own converter if that's what it takes, but resources on how to do that are scarce. Can I instead somehow "skip" this converter? I tried setting enabled=False in the decorator in the converter file, but then something else breaks and I get this instead:
```
    779 outputs = module(*inputs)
    781 outputs_flat = output_flattener.flatten(outputs)
--> 782 ctx.mark_outputs(outputs_flat, output_names)
    784 # set max workspace size
    785 config.max_workspace_size = max_workspace_size

File ~/miniconda3/envs/pt2/lib/python3.10/site-packages/torch2trt-0.4.0-py3.10.egg/torch2trt/torch2trt.py:548, in ConversionContext.mark_outputs(self, torch_outputs, names)
    545 self.output_names = names
    547 for i, torch_output in enumerate(torch_outputs):
--> 548     trt_tensor = torch_output._trt
    549     trt_tensor.name = names[i]
    550     trt_tensor.location = torch_device_to_trt(torch_output.device)
...
   1073     return _size_wrapper(self)
   1074 else:
-> 1075     return _old_getattr(self, name)

AttributeError: 'Tensor' object has no attribute '_trt'
```
Can you help me out here with how to fix or skip that op? For example, if you only support certain kernel sizes (say, only multiples of 224, for the sake of argument), could I pad my unsupported 384 up to 448 and then do the max pooling? I already tried that in PyTorch for these particular shapes and it doesn't work, layer is still None, so do you see any other tricks I could try?
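Roughly what the padding attempt looked like (simplified sketch; the exact padding amounts and the global output size just reflect how my model is set up):

```python
import torch
import torch.nn.functional as F

x = torch.ones(1, 3, 384, 384).cuda()

# Pad the 384x384 map symmetrically up to 448x448 before pooling,
# hoping a "multiple of 224" window would be accepted.
x_padded = F.pad(x, (32, 32, 32, 32))               # -> (1, 3, 448, 448)
y = F.adaptive_max_pool2d(x_padded, output_size=1)  # window is now 448x448
```

During conversion, add_pooling still returns None with the padded size.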
@jaybdub shamelessly tagging you to increase my chance of getting help 🙃
@gabe-scorebreak this could be your problem: https://forums.developer.nvidia.com/t/parse-onnx-file-failed-parameter-check-failed-condition-alldimsgteq-windowsize-1-volume-windowsize-max-kernel-dims-product-nbspat/214310/4. TRT only supports pooling windows up to a certain volume (MAX_KERNEL_DIMS_PRODUCT).
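If that's what is hitting you, then 224*224 = 50176 stays under the 100000 window-volume threshold used in the check below, while 384*384 = 147456 goes over it, which would explain why add_pooling returns None for your size.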
I tried a workaround for this, not sure if it works for you or not:
```python
def check_VolumeWindowSize(kernel_size, shape):
    # If the window volume exceeds what TRT accepts, fall back to the
    # input's spatial dims as the pooling window.
    if len(kernel_size) == 2:
        temp = 1
        for e in kernel_size:
            temp *= e
        if temp >= 100000:
            kernel_size = tuple(shape[2:])
    elif len(kernel_size) == 3:
        temp = 1
        for e in kernel_size:
            temp *= e
        if temp >= 100000000:
            kernel_size = tuple(shape[1:])
    return kernel_size


shape = input.shape
new_kernel_size = check_VolumeWindowSize(kernel_size, shape)
```
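For context, this is roughly how I'd wire it in as a custom converter. The structure (ctx.method_args, ctx.method_return, add_pooling) follows how I remember adaptive_max_pool2d.py being written, not a verbatim copy, so double-check the registration string and the kernel/stride math against your torch2trt version:

```python
import tensorrt as trt
from torch2trt import tensorrt_converter

@tensorrt_converter('torch.nn.functional.adaptive_max_pool2d')
def convert_adaptive_max_pool2d_checked(ctx):
    # Same shape math as the stock converter, but with the window-size
    # check applied before add_pooling. Registering it again like this
    # should shadow the built-in converter.
    input = ctx.method_args[0]
    output = ctx.method_return

    output_size = ctx.method_args[1]
    if isinstance(output_size, int):
        output_size = (output_size, output_size)

    # window = input spatial size // output size (integer division)
    stride = (input.shape[-2] // output_size[-2],
              input.shape[-1] // output_size[-1])
    kernel_size = stride

    new_kernel_size = check_VolumeWindowSize(kernel_size, input.shape)

    layer = ctx.network.add_pooling(
        input=input._trt, type=trt.PoolingType.MAX,
        window_size=new_kernel_size)
    if layer is None:
        # Fail loudly instead of the AttributeError on layer.stride.
        raise RuntimeError(
            'add_pooling rejected window_size {}'.format(new_kernel_size))
    layer.stride = stride
    output._trt = layer.get_output(0)
```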