Xiaodong (Vincent) Huang comments

Results 285 comments of Xiaodong (Vincent) Huang

Do I have to do PTQ before QAT with pytorch_quantization toolkit?

Hi @deephog, we recommend doing PTQ first, then QAT to fine-tune the weights using the fixed quantization scales. This helps convergence. In theory you can...
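The PTQ-then-QAT order described in this comment can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pytorch_quantization API: the function names `calibrate_amax` and `fake_quantize`, the choice of max calibration, and the sample values are all hypothetical. It only shows that the scale is computed once during calibration and then held fixed while the weights would be fine-tuned.

```python
# Illustrative sketch only: no pytorch_quantization calls are used here.

def calibrate_amax(samples):
    """PTQ step (assumed max calibration): largest absolute value observed."""
    return max(abs(x) for x in samples)

def fake_quantize(x, amax, num_bits=8):
    """Quantize then dequantize x with a fixed symmetric scale derived from amax."""
    qmax = 2 ** (num_bits - 1) - 1      # 127 for int8
    scale = amax / qmax                 # fixed after calibration
    q = round(x / scale)                # snap to the integer grid
    q = max(-qmax, min(qmax, q))        # clamp to the representable range
    return q * scale                    # back to floating point

# 1) PTQ: collect statistics once on hypothetical calibration data.
calib_data = [-0.5, 0.25, 1.0, -2.0]
amax = calibrate_amax(calib_data)

# 2) QAT: amax stays frozen; the forward pass sees fake-quantized values
#    while an optimizer (not shown) would update the underlying weights.
weights = [0.1, -1.3, 3.0]
quantized = [fake_quantize(w, amax) for w in weights]
```

Values outside the calibrated range (3.0 here) saturate at `amax`, which is one reason a good calibration pass before fine-tuning matters.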

When running "python -m pip install colored polygraphy --extra-index-url https://pypi.ngc.nvidia.com", can't find install.ps1

Closing since there has been no response for more than 3 weeks. Please reopen if you still have questions, thanks!

Mismatched type error when generating an engine for a quantized stereo-depth model

The `fnet.conv1.weight` tensor is shared by multiple conv layers, and TRT currently cannot constant-fold shared weights. Dynamic weights input support could fix this; it is already in the TRT plan...
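To make the "shared by multiple conv layers" condition concrete, here is a small sketch that finds weights consumed by more than one Conv node. The graph is a hypothetical, simplified list of tuples rather than a real ONNX model; the node names and the helper `find_shared_weights` are assumptions for illustration, and only `fnet.conv1.weight` comes from the comment above.

```python
from collections import defaultdict

# Hypothetical, simplified graph: (node_name, op_type, weight_name).
nodes = [
    ("fnet.conv1_left", "Conv", "fnet.conv1.weight"),
    ("fnet.conv1_right", "Conv", "fnet.conv1.weight"),
    ("fnet.conv2", "Conv", "fnet.conv2.weight"),
]

def find_shared_weights(nodes):
    """Return {weight_name: [consumer node names]} for weights used by 2+ Conv nodes."""
    consumers = defaultdict(list)
    for name, op_type, weight in nodes:
        if op_type == "Conv":
            consumers[weight].append(name)
    return {w: c for w, c in consumers.items() if len(c) > 1}

shared = find_shared_weights(nodes)
```

A check like this on the exported graph is one way to spot which layers might hit the limitation before building an engine.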

Mismatched type error when generating an engine for a quantized stereo-depth model

I created internal task 3741010 to track this issue. @deephog, could you wait for the next major release? Thanks!

Mismatched type error when generating an engine for a quantized stereo-depth model

The model (https://drive.google.com/file/d/1xJyU7CnVqzc8tBU0_ruewTD1fgxrz1EA/view?usp=sharing) will be fixed in 8.5 EA. Closing, and thanks!

Is there a way to speed up collect_stats() method in pytorch quantization?

@gj-raza, currently the quantization is implemented as a sequence of PyTorch ops, and this can be accelerated by using a CUDA extension. I will create an internal feature request for this, thanks!

TensorRT8 int8 model convtranspose layer output error

Hello @zhangjoey115, I assume you are using the `mark all` function in Polygraphy. First, `mark all` can hide some issues when debugging accuracy; this is because without `mark...

TensorRT8 int8 model convtranspose layer output error

@zhangjoey115, sorry for the delayed response. Could you share a simple repro ONNX model? Thanks!

How to convert .engine to .plan?

Closing since there has been no response for more than 3 weeks. Please reopen if you still have questions, thanks!

INT8 inference in TensorRT 8.0 get wrong answer

Hello @Ricardosuzaku, the steps are correct. What accuracy do you get when you run the pytorch_quantization toolkit? Thanks!