Manu Mathew
Hi, As you can see in the method merge_quantize_weights(), the parameters are quantized. So both parameters and activations are quantized. But the effects of both parameter quantization and activation quantization are removed...
Interesting discussion. Let's continue. You said: "we should correct the conv layer's w to wq". I think it is clear that the weights used in the forward pass in this code are...
Let me write it step by step, and you tell me in which step you think there should be a modification: 1. merge_quantize_weights will quantize the weights w to wq....
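To make step 1 concrete, here is a minimal sketch of what quantizing w to wq could look like, assuming a symmetric per-tensor scheme; the function name fake_quantize_weight and the scale choice are illustrative assumptions, not the repository's actual code.

```python
import torch

def fake_quantize_weight(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    # Symmetric per-tensor fake quantization: quantize, then dequantize,
    # so wq stays a float tensor but only takes representable values.
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for 8 bits
    scale = w.abs().max() / qmax              # assumed per-tensor scale
    return torch.round(w / scale).clamp(-qmax - 1, qmax) * scale

w = torch.randn(16, 3, 3, 3)                  # a conv weight tensor
wq = fake_quantize_weight(w)                  # step 1: w -> wq
```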
From the paper that you quoted, section 3.2: "However, **we maintain weights in floating point** and update them with the gradient updates. This ensures that minor gradient updates gradually update...
Okay. It is clearer now. So what you are saying is that the backward computation should use wq and yq (but update w, as per the paper). But instead, what is happening is...
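A minimal sketch of that arrangement, reusing the assumed fake_quantize_weight helper from above: the forward (and hence backward) computation sees wq and yq, while the optimizer step updates the floating-point master weight w, as the paper prescribes.

```python
import torch
import torch.nn.functional as F

w = torch.randn(16, 3, 3, 3, requires_grad=True)   # float master weight
opt = torch.optim.SGD([w], lr=0.01)

x = torch.randn(2, 3, 32, 32)
# Forward uses the quantized value; .detach() makes the gradient
# flow to w as if quantization were the identity (the STE idea).
wq = w + (fake_quantize_weight(w) - w).detach()
y = F.conv2d(x, wq, padding=1)
yq = y + (fake_quantize_weight(y) - y).detach()    # activation fake-quant (same assumed helper)

loss = yq.pow(2).mean()                            # placeholder loss
loss.backward()                                    # gradients reach w through the STE
opt.step()                                         # the float w is what gets updated
```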
Interesting! Thanks for pointing this out - something to think about.
I have cleaned up the implementation of STE a bit for better understanding. If you have a float tensor "y" and a fake quantized tensor "yq", then to do STE,...
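The snippet below is a guess at the shape of that cleanup (the actual code is not quoted here): combine y and yq so the forward value equals yq while the gradient is computed as if the op were the identity on y.

```python
import torch

def apply_ste(y: torch.Tensor, yq: torch.Tensor) -> torch.Tensor:
    # Forward value == yq; in backward, the detached term contributes
    # nothing, so the gradient passes straight through to y.
    return y + (yq - y).detach()

y = torch.randn(4, requires_grad=True)
yq = torch.round(y * 4) / 4        # some fake-quantized version of y
out = apply_ste(y, yq)
out.sum().backward()
print(y.grad)                      # all ones: quantization bypassed in backward
```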
If you export the ONNX graph using the original https://github.com/open-mmlab/mmdetection, it results in a complicated graph for the final detection portion after the convolution layers. However, we have represented all the...
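For context, a plain torch.onnx.export call is sketched below; the torchvision model and input shape are placeholders standing in for a detector, and the mmdetection-specific export wrappers are not shown.

```python
import torch
import torchvision

# Placeholder model; the real detector's post-processing is what
# produces the complicated graph being discussed.
model = torchvision.models.resnet18(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["output"],
    opset_version=11,
)
```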
The quantization simulation required for QAT is done in PyTorch code. This may be the reason for the slowness. It will be faster if it is done in the underlying C++...
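As an illustration of the overhead: a Python-level fake-quantize step launches a handful of separate elementwise kernels per layer per iteration, whereas a backend implementation fuses them. PyTorch's fused torch.fake_quantize_per_tensor_affine operator is one existing way to push this into C++; the scale and zero-point values below are arbitrary.

```python
import torch

x = torch.randn(1, 64, 56, 56)
scale, zero_point, qmin, qmax = 0.05, 0, -128, 127

# Pure-Python simulation: divide, round, clamp, multiply -- several kernels.
xq_py = (torch.round(x / scale + zero_point).clamp(qmin, qmax) - zero_point) * scale

# The same computation as a single fused operator in the C++ backend.
xq_cpp = torch.fake_quantize_per_tensor_affine(x, scale, zero_point, qmin, qmax)

print(torch.allclose(xq_py, xq_cpp))   # expected: True
```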
I believe there is some mistake, but this will not change the speed: QAT training will be slower than regular training. For QAT training, you need to give the model...