Dheeraj Peri

Results: 121 comments by Dheeraj Peri

We currently use the NVIDIA Model Optimizer toolkit, which inserts quantization nodes within the torch model using the quantize API: 1) https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/9c54aa1c47871d0541801a20962996461d805162/modelopt/torch/quantization/model_quant.py#L126 2) https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/9c54aa1c47871d0541801a20962996461d805162/modelopt/torch/quantization/tensor_quant.py#L229-L243 (definition of the custom ops which do the quantization)....

Thanks @junstar92 for the contribution. Instead of modifying the FX path, we should import these utilities from the dynamo path since it is actively being developed. So, instead can you...

Also, @junstar92, please rebase with main. Some of the CI failures should then be resolved.

PR 3513 is merged. Please try with the latest build; 25.05 should include it. Please reopen if the issue persists.

Hello @yjjinjie, is this still an issue? If so, can you point me to the exact model (`Matmul` or `Matmul2`) in https://github.com/pytorch/TensorRT/issues/3127#issuecomment-2325512369 that shows the issue?

Model device budget:
- Include PyTorch subgraph weights as well in the device budget (count parameters and estimate size)
- Conclusion: estimate the new budget by subtracting PyTorch submodule sizes....
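The budget arithmetic above can be sketched roughly as follows. This is only an illustration, not the actual Torch-TensorRT implementation: the helper names `module_size_bytes` and `remaining_budget_bytes` are hypothetical, and the code assumes each submodule exposes `parameters()` returning tensors with `numel()` and `element_size()`, as `torch.nn.Module` and `torch.Tensor` do. Buffers and parameters shared between submodules are not handled here.

```python
from typing import Iterable, Protocol


class TensorLike(Protocol):
    """Anything exposing the two size queries a torch.Tensor provides."""

    def numel(self) -> int: ...
    def element_size(self) -> int: ...


class ModuleLike(Protocol):
    """Anything exposing parameters() the way torch.nn.Module does."""

    def parameters(self) -> Iterable[TensorLike]: ...


def module_size_bytes(module: ModuleLike) -> int:
    """Estimate weight memory: element count times bytes per element, summed."""
    return sum(p.numel() * p.element_size() for p in module.parameters())


def remaining_budget_bytes(
    total_budget_bytes: int, pytorch_submodules: Iterable[ModuleLike]
) -> int:
    """Subtract the estimated size of PyTorch-resident submodules from the budget.

    Hypothetical helper: clamps at zero so an over-committed budget
    never goes negative.
    """
    used = sum(module_size_bytes(m) for m in pytorch_submodules)
    return max(total_budget_bytes - used, 0)
```

With real models, `pytorch_submodules` would be the submodules left to run in PyTorch, and the returned value would be the budget left for the remaining subgraphs.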

Thanks for filing this, @broken-dream. I was able to reproduce this with the torch 2.5 nightly. It works with PyTorch 2.4, so this is a regression. We will investigate further.

Hello @kacper-kleczewski, I tried your script with the main branch (which is on `2.5.0.dev20240822+cu124`) and it works fine. Here's the slightly modified script that I tried ```py import torch...