
Problem with AdaptiveMaxPool2d, likely with stride/kernel_size

Open gabe-scorebreak opened this issue 2 years ago • 2 comments

I want to convert a model to TRT, but I have a problem with nn.AdaptiveMaxPool2d. It seems to work only for specific kernel sizes. In my case it's (384, 384), and the tests in adaptive_max_pool2d.py only cover (224, 224): @add_module_test(torch.float32, torch.device("cuda"), [(1, 3, 224, 224)])

So, as I said, my kernel_size is (384, 384), and the following line in the converter file returns None: layer = ctx.network.add_pooling(input=input._trt, type=trt.PoolingType.MAX, window_size=kernel_size). Hence layer.stride = stride fails with an attribute error, because layer is None. Is there any way to fix this issue?
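For context on where the (384, 384) window comes from: the converter turns adaptive pooling into a fixed pooling window derived from the input and output spatial sizes, roughly stride = in // out with kernel = stride. So an input of 384x384 with output size (1, 1) yields a single 384x384 window. A simplified sketch of that mapping (an illustration, not torch2trt's exact code):

```python
def adaptive_to_fixed(in_size, out_size):
    # Approximate mapping for the case where out_size divides in_size
    # evenly: the stride steps by in // out and the kernel covers one step.
    stride = tuple(i // o for i, o in zip(in_size, out_size))
    kernel = stride
    return kernel, stride

kernel, stride = adaptive_to_fixed((384, 384), (1, 1))
# A single window covering the whole input: kernel == stride == (384, 384)
```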

I would be down to write my own converter if that's what it takes, but resources on how to do that are scarce. Can I instead somehow "skip" this converter? I tried setting enabled=False in the decorator in the converter file, but then something else breaks and I get this instead:

779         outputs = module(*inputs)
    781         outputs_flat = output_flattener.flatten(outputs)
--> 782         ctx.mark_outputs(outputs_flat, output_names)
    784 # set max workspace size
    785 config.max_workspace_size = max_workspace_size

File [~/miniconda3/envs/pt2/lib/python3.10/site-packages/torch2trt-0.4.0-py3.10.egg/torch2trt/torch2trt.py:548], in ConversionContext.mark_outputs(self, torch_outputs, names)
    545 self.output_names = names
    547 for i, torch_output in enumerate(torch_outputs):
--> 548     trt_tensor = torch_output._trt
    549     trt_tensor.name = names[i]
    550     trt_tensor.location = torch_device_to_trt(torch_output.device)
...
   1073     return _size_wrapper(self)
   1074 else:
-> 1075     return _old_getattr(self, name)

AttributeError: 'Tensor' object has no attribute '_trt'

Can you help me out here with how to fix or skip that op? For example, if you only support certain kernel sizes (say, only multiples of 224, for the sake of argument), could I pad my unsupported size 384 up to 448 and then perform max pooling? I already tried that in PyTorch code for these particular shapes and it doesn't work, layer is still None, so do you see any other such tricks I could try?
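One alternative trick worth considering (a sketch under the assumption that the window volume is the limiting factor, not a drop-in patch for the converter): since max is associative, a single huge pooling window can be decomposed into two chained pools whose individual windows are small. Here 384x384 is split into a 24x24 pool followed by a 16x16 pool, illustrated with NumPy:

```python
import numpy as np

def max_pool2d(a, k):
    # Non-overlapping k x k max pooling via a reshape; assumes the
    # spatial dims are exact multiples of k.
    h, w = a.shape[0] // k, a.shape[1] // k
    return a.reshape(h, k, w, k).max(axis=(1, 3))

x = np.random.rand(384, 384)

stage1 = max_pool2d(x, 24)       # 384x384 -> 16x16, window volume 576
stage2 = max_pool2d(stage1, 16)  # 16x16  -> 1x1,   window volume 256

# The chained result equals a single global max over the full input.
assert stage2[0, 0] == x.max()
```

Expressed as two nn.MaxPool2d layers in the model (instead of one AdaptiveMaxPool2d), each stage's window stays far below the reported limit, so each add_pooling call should get a valid window.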

gabe-scorebreak avatar May 29 '23 16:05 gabe-scorebreak

@jaybdub shamelessly tagging you to increase my chance of getting help 🙃

gabe-scorebreak avatar May 29 '23 16:05 gabe-scorebreak

@gabe-scorebreak this could be your problem: https://forums.developer.nvidia.com/t/parse-onnx-file-failed-parameter-check-failed-condition-alldimsgteq-windowsize-1-volume-windowsize-max-kernel-dims-product-nbspat/214310/4, TRT only supports pooling windows up to a certain volume.

I tried a workaround for this; not sure whether it will work for you:

def check_VolumeWindowSize(kernel_size, shape):
    # TensorRT rejects pooling windows whose volume exceeds an internal
    # limit (MAX_KERNEL_DIMS_PRODUCT); if the requested window is too
    # large, fall back to the input's spatial size.
    volume = 1
    for e in kernel_size:
        volume *= e
    if len(kernel_size) == 2 and volume >= 100000:
        kernel_size = tuple(shape[2:])
    elif len(kernel_size) == 3 and volume >= 100000000:
        kernel_size = tuple(shape[2:])  # skip the batch and channel dims
    return kernel_size

shape = input.shape
new_kernel_size = check_VolumeWindowSize(kernel_size, shape)
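As a quick sanity check on the thresholds above (assuming the 2D limit really is around 100,000, per the linked forum post): the tested (224, 224) window stays under it, while the reported (384, 384) window exceeds it, which matches the behavior in this issue:

```python
from math import prod

# Window volumes against the assumed 2D limit of 100000.
assert prod((224, 224)) == 50176    # under the limit, converts fine
assert prod((384, 384)) == 147456   # over the limit, add_pooling returns None
```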

usama-baloch avatar Oct 02 '23 14:10 usama-baloch