🐛 [Bug] None as const doesn't work
Bug Description
I tried to compile a scripted module with torch-tensorrt, but the module includes None as a const, and it seems that is not supported.
To Reproduce
Take a module with None as a variable in it, use torch.jit.script on it and then try to compile it with torch-tensorrt.
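For illustration, a minimal sketch of that shape (the module, sizes, and input spec here are made up; note that, as discussed further down in the thread, a case this small may not actually trigger the error):

import torch
import torch_tensorrt

class WithNone(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.extra = None  # a None stored on the module
        self.linear = torch.nn.Linear(16, 16, bias=False)

    def forward(self, x):
        return self.linear(x)

scripted = torch.jit.script(WithNone())
trt_mod = torch_tensorrt.compile(
    scripted,
    inputs=[torch_tensorrt.Input((1, 16))],
)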
The error I got: RuntimeError: 0 INTERNAL ASSERT FAILED at "../torch/csrc/jit/ir/alias_analysis.cpp":607, please report a bug to PyTorch. We don't have an op for trt::const but it isn't a special case. Argument types: NoneType,
Candidates: trt::const(Tensor val) -> (Tensor)
Environment
Build information about Torch-TensorRT can be found by turning on debug messages
- Torch-TensorRT Version (e.g. 1.0.0): compiled from master
- PyTorch Version (e.g. 1.0): 1.11
- CPU Architecture: Don't know
- OS (e.g., Linux): Linux
- How you installed PyTorch (conda, pip, libtorch, source): pip
- Build command you used (if compiling from source):
- Are you using local sources or building from archives:
- Python version: 3.8
- CUDA version: 11.3
- GPU models and configuration: T4
- Any other relevant information:
Can you provide a reproducing example?
@narendasan I did some more digging and I found where the issue is, but I can't seem to reproduce it in a minimal example.
Basically, the linear_to_addmm lowering pass has one path for bias=None/False and another for a real bias. It seems that, for some reason, in a model where I have many linear layers and only one of them has bias=False, it tries to convert that layer as if it does have a bias, producing the situation where const(None) exists. A sketch of that model shape is below.
When I try to reproduce it with a simple module with a single linear layer, it works as expected and I don't see the issue.
What can I check to see why it happens in my specific model (sharing the model is unfortunately not possible)?
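To make that concrete, the model looks roughly like this (layer names and sizes here are made up, since I can't share the real model):

import torch

class ManyLinears(torch.nn.Module):
    # Several Linear layers with a bias plus one without, mirroring the
    # situation described above; names and sizes are arbitrary.
    def __init__(self):
        super().__init__()
        self.l1 = torch.nn.Linear(32, 32)                   # bias=True by default
        self.l2 = torch.nn.Linear(32, 32)                   # bias=True by default
        self.no_bias = torch.nn.Linear(32, 32, bias=False)  # the odd one out

    def forward(self, x):
        return self.no_bias(self.l2(self.l1(x)))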
If you turn on graph-level logging:
with torch_tensorrt.logging.graphs():
    my_trt_mod = torch_tensorrt.compile(my_model, ...)
you will get the graphs at each stage of lowering. We should log the transformations, including pre- and post-linear_to_addmm.
@narendasan I did that. I can see that the linear was getting None for the bias parameter, and yet torch-tensorrt did the transformation as if it were a regular linear without a "None". I just don't know why that happens.
Hi, I have met a similar problem recently. Have you figured out how to solve it now?
No. I couldn't create a small reproducible example. Every example I tried worked, but my real use case didn't. If you can make a reproducible example, it might help the developers solve this.
I found a workaround that might be helpful.
- First, check all the modules with a bias and select those whose bias is False/None.
- Second, assign those biases zeros before compiling the model, e.g. model.decoder.xxxx.bias = torch.nn.Parameter(torch.zeros([yyy_shape])) (a generic sketch follows below).
- 🎉 Problem solved (temporarily) in my case.
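A generic sketch of that workaround (the helper name, and the assumption that all affected layers are torch.nn.Linear, are mine). Because the added bias is all zeros, the model's outputs are unchanged:

import torch

def zero_out_missing_biases(model: torch.nn.Module) -> None:
    # Replace every missing (None) Linear bias with zeros so the lowering
    # pass never sees bias=None; a zero bias leaves the output unchanged.
    for module in model.modules():
        if isinstance(module, torch.nn.Linear) and module.bias is None:
            module.bias = torch.nn.Parameter(
                torch.zeros(module.out_features,
                            dtype=module.weight.dtype,
                            device=module.weight.device)
            )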
This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.