quantized.pytorch
Straight through estimator
I noticed that you don't cancel the gradient for large values when using the straight-through estimator here.
The QNN paper claims that "Not cancelling the gradient when r is too large significantly worsens performance".
Does this only matter for low-precision quantization (e.g., binary)?
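For context, here is a minimal sketch of the gradient cancellation the QNN paper describes, written as a standalone PyTorch `autograd.Function` (the class name `BinarizeSTE` is hypothetical and this is not the repo's actual implementation): the forward pass quantizes with `sign`, and the backward pass passes the gradient straight through but zeroes it wherever `|r| > 1`.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Sign quantization with the clipped straight-through estimator.

    Forward: q = sign(r).
    Backward: pass the incoming gradient through unchanged, but cancel
    (zero) it wherever |r| > 1, as prescribed in the QNN paper.
    NOTE: illustrative sketch only, not quantized.pytorch's code.
    """

    @staticmethod
    def forward(ctx, r):
        ctx.save_for_backward(r)
        return r.sign()

    @staticmethod
    def backward(ctx, grad_output):
        (r,) = ctx.saved_tensors
        # Clipped STE: dq/dr is approximated by 1_{|r| <= 1},
        # so the gradient is cancelled for large |r|.
        return grad_output * (r.abs() <= 1).to(grad_output.dtype)

# Quick check of the cancellation behaviour:
x = torch.tensor([-2.0, -0.5, 0.5, 2.0], requires_grad=True)
BinarizeSTE.apply(x).sum().backward()
print(x.grad)  # tensor([0., 1., 1., 0.]) -- gradient cancelled where |x| > 1
```

Dropping the `(r.abs() <= 1)` mask in `backward` gives the unclipped variant this issue is asking about.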