Results: 11 comments by pangr

> In the only example provided in the toolkit, it loaded the PTQ-calibrated weights and did QAT based on them. There isn't a standalone QAT example without PTQ...

> By default, TRT assumes that the network inputs/outputs are in FP32 linear (i.e. NCHW) format. However, many tactics in TRT require different formats, like NHWC8 or NC/32HW32 formats, so...
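The difference between the linear NCHW layout and a channel-packed format like NC/32HW32 can be illustrated with plain NumPy. This is a sketch of the general packing idea (channels grouped into blocks, padded to a multiple of the block size), not TensorRT's actual memory layout code; the function name is hypothetical.

```python
import numpy as np

# Hypothetical illustration: repack an NCHW tensor into a channel-blocked
# "NC/xHWx" layout, analogous to TensorRT's NC/32HW32 (blocks of 32 channels).
def nchw_to_ncxhwx(t: np.ndarray, x: int = 32) -> np.ndarray:
    n, c, h, w = t.shape
    pad = (-c) % x                      # pad channels up to a multiple of x
    t = np.pad(t, ((0, 0), (0, pad), (0, 0), (0, 0)))
    cb = (c + pad) // x                 # number of channel blocks
    # N, C/x, x, H, W  ->  N, C/x, H, W, x  (block of x channels innermost)
    return t.reshape(n, cb, x, h, w).transpose(0, 1, 3, 4, 2)

t = np.arange(2 * 48 * 4 * 4, dtype=np.float32).reshape(2, 48, 4, 4)
packed = nchw_to_ncxhwx(t, 32)
print(packed.shape)  # (2, 2, 4, 4, 32): 48 channels padded to 64 = 2 blocks of 32
```

The point of such formats is that a block of channels sits contiguously in memory, which is what the vectorized INT8/FP16 tactics operate on.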

> https://github.com/NVIDIA/TensorRT/tree/main/tools/onnx-graphsurgeon and https://github.com/NVIDIA/TensorRT/tree/main/tools/onnx-graphsurgeon/examples/04_modifying_a_model

Sorry, it doesn't work. Is there any inference in pytorch_quantization? I just don't want to quantize "Gemm".

> Or create a case and run it with trtexec --verbose; you will be able to see the final engine structure in the log, which will tell you if TRT can support...
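A minimal invocation along those lines might look like this (assuming an ONNX model file named `model.onnx`; the flags are standard trtexec options):

```shell
# Build an INT8 engine and dump the layer-by-layer engine structure.
# Fused layers appear as combined entries in the verbose log.
trtexec --onnx=model.onnx --int8 --verbose 2>&1 | tee build.log
```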

Does TensorRT support LeakyReLU quantization?

> Whether the fusion happens depends on whether TRT has tactics supporting that. The very rough guidelines are:
>
> * Conv+LeakyReLU should be fused in FP16 or in INT8...

When I asked "Who is founder of goolge.com?", llama13B answered as shown in the figure below: "tro tro tro tro tro tro tro tro tro tro tro...

> We have implemented W8A8 inference in vLLM, which can achieve a 30% improvement in throughput. W4A16 quantization methods require weights to be dequantized into fp16 before compute and lead...
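The distinction can be sketched in NumPy (an illustration of the arithmetic, not vLLM's kernels): in W8A8 both weights and activations are int8, so the matmul itself runs on integers, while in W4A16 the 4-bit weights must be dequantized back to fp16 before a floating-point matmul. All scales here use simple per-tensor max-abs quantization for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)   # weights
x = rng.standard_normal((8, 64)).astype(np.float32)    # activations

# W8A8: weights AND activations quantized to int8; the matmul accumulates
# in int32, with a single dequantization at the end.
sw = np.abs(w).max() / 127.0
sx = np.abs(x).max() / 127.0
w8 = np.round(w / sw).astype(np.int8)
x8 = np.round(x / sx).astype(np.int8)
y_w8a8 = (x8.astype(np.int32) @ w8.T.astype(np.int32)) * (sw * sx)

# W4A16: weights quantized to 4 bits, but they must be dequantized to fp16
# before compute, so the matmul is still a floating-point matmul.
sw4 = np.abs(w).max() / 7.0
w4 = np.clip(np.round(w / sw4), -8, 7).astype(np.int8)  # 4-bit signed range
w_dequant = w4.astype(np.float16) * np.float16(sw4)     # dequant-before-compute
y_w4a16 = x.astype(np.float16) @ w_dequant.T

# Both approximate the fp32 reference; W8A8 skips the dequant-before-compute step.
y_ref = x @ w.T
print(np.abs(y_w8a8 - y_ref).max(), np.abs(y_w4a16.astype(np.float32) - y_ref).max())
```

The dequantization step in the W4A16 path is the extra per-layer work the comment refers to; it is what limits throughput at higher batch sizes even though the 4-bit weights save memory bandwidth.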

> It's best to add some other speakers; with too little data, training easily leads to model collapse and catastrophic forgetting.

I added the training set from aishell3 and trained on it together, and froze the MRTE parameters; after 300000 steps the result is still the same. Is there any parameter configuration that needs adjusting?

> Hi~, the training here is meant to let the model accept the MSAC input proposed in the paper. But we didn't have that much data to train the model, so the learning rate is set relatively small, letting the model learn the MSAC input without destroying its pretrained capabilities.

Understood, thanks for the reply. One more question: during mini-monkey's entire pretraining, was only the text LLM trained, with neither the vision module nor the modality-alignment module trained?