
🐛 [Bug] None as const doesn't work

Open naor2013 opened this issue 2 years ago • 7 comments

Bug Description

I tried to convert a scripted module with torch-tensorrt, but it contains None as a constant, and it seems like that is not supported.

To Reproduce

Take a module that has None as a variable in it, script it with torch.jit.script, and then try to compile it with torch-tensorrt.
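Note: the reporter later says a minimal module did *not* trigger the bug for them, so the following is only an illustrative sketch of the setup being described (the module and its names are mine), not a confirmed repro:

```python
import torch
import torch.nn as nn

class BiaslessHead(nn.Module):
    # Hypothetical minimal module: a single Linear layer with bias=False,
    # so the scripted module carries None for the bias attribute.
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(16, 4, bias=False)

    def forward(self, x):
        return self.proj(x)

scripted = torch.jit.script(BiaslessHead())

# Compiling the scripted module is where the trt::const(None) assert fires.
# (torch_tensorrt and a GPU are assumed; not run here):
# import torch_tensorrt
# trt_mod = torch_tensorrt.compile(
#     scripted,
#     inputs=[torch_tensorrt.Input((1, 16))],
#     enabled_precisions={torch.float},
# )
```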

The error I got:

    RuntimeError: 0INTERNAL ASSERT FAILED at "../torch/csrc/jit/ir/alias_analysis.cpp":607, please report a bug to PyTorch. We don't have an op for trt::const but it isn't a special case. Argument types: NoneType,

    Candidates: trt::const(Tensor val) -> (Tensor)

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0): compiled from master
  • PyTorch Version (e.g. 1.0): 1.11
  • CPU Architecture: Don't know
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, libtorch, source): pip
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version: 3.8
  • CUDA version: 11.3
  • GPU models and configuration: T4
  • Any other relevant information:

naor2013 avatar Apr 16 '22 18:04 naor2013

Can you provide a reproducing example?

narendasan avatar Apr 16 '22 21:04 narendasan

@narendasan I did some more digging and found where the issue is, but I can't seem to reproduce it in a minimal example.

Basically, in the linear_to_addmm lowering pass there is one path for bias=None/False and another for a real bias. For some reason, in a model where I have many linear layers and only one of them has bias=False, it tries to convert that layer as if it does have a bias, which is how the const(None) ends up in the graph.

When I try to reproduce it with a simple module containing a single linear layer, it works as expected and the issue does not appear.

What can I check to see why it happens in my specific model (sharing the model is not possible unfortunately)?
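As a first step, one way to audit a larger model is to list every Linear layer (the layers handled by linear_to_addmm) together with whether it actually carries a bias, to pinpoint the lone bias=False layer. A small sketch (the helper name is mine):

```python
import torch.nn as nn

def report_bias_state(model: nn.Module) -> list:
    # Hypothetical helper: collect (name, has_bias) for every Linear layer,
    # so the one layer with bias=False stands out in a large model.
    rows = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            rows.append((name, module.bias is not None))
    return rows

# Toy model standing in for the real one: the second layer has no bias.
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8, bias=False))
for name, has_bias in report_bias_state(model):
    print(f"{name}: bias={'yes' if has_bias else 'no'}")
```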

naor2013 avatar Apr 17 '22 00:04 naor2013

If you turn on graph mode:

with torch_tensorrt.logging.graphs():
    my_trt_mod = torch_tensorrt.compile(my_model, ...)

You will get the graphs at each stage of lowering; the transformations should be logged, including the graphs before and after linear_to_addmm.

narendasan avatar Apr 17 '22 23:04 narendasan

@narendasan I did that, and I can see that the linear layer was getting None for the bias parameter, yet torch-tensorrt did the transformation as if it were a regular linear without a None. I just don't know why that happens.

naor2013 avatar Apr 18 '22 07:04 naor2013

Hi, I ran into a similar problem recently. Have you figured out how to solve it?

geekinglcq avatar Jun 06 '22 07:06 geekinglcq

No. I couldn't create a small reproducible example: every example I tried worked, and only my actual use case failed. If you can put together a reproducible example, it might help the developers solve this.

naor2013 avatar Jun 06 '22 09:06 naor2013

I found a workaround that might be helpful.

  1. First, check all the modules for a bias and select the ones where it is False/None.
  2. Then, before compiling the model, assign those modules a zero bias, e.g. model.decoder.xxxx.bias = torch.nn.Parameter(torch.zeros([yyy_shape])).
  3. 🎉 Problem solved (temporarily) in my case.
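The steps above can be sketched as a small helper (the function name and toy model are mine; a zero bias leaves the forward pass numerically unchanged while giving the lowering pass a real tensor instead of a None constant):

```python
import torch
import torch.nn as nn

def zero_fill_missing_biases(model: nn.Module) -> None:
    # Workaround sketch: replace a missing (None) bias on every Linear
    # layer with an explicit all-zeros bias before compiling.
    for module in model.modules():
        if isinstance(module, nn.Linear) and module.bias is None:
            module.bias = nn.Parameter(
                torch.zeros(module.out_features, dtype=module.weight.dtype)
            )

# Toy model: one layer has bias=False, mirroring the reported setup.
model = nn.Sequential(nn.Linear(4, 8, bias=False), nn.ReLU(), nn.Linear(8, 2))
x = torch.randn(3, 4)
before = model(x)
zero_fill_missing_biases(model)
after = model(x)  # identical output: the new bias is all zeros
```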

geekinglcq avatar Jun 07 '22 02:06 geekinglcq

This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.

github-actions[bot] avatar Sep 06 '22 00:09 github-actions[bot]