haq Regarding paper and codes

After digging into the code and the paper, I have two questions.

1. The paper states: "If the current policy exceeds our resource budget (on latency, energy or model size), we will sequentially decrease the bitwidth of each layer until the constraint is finally satisfied." Which part of the code corresponds to this step, i.e. decreasing the bitwidth of each layer when the current policy exceeds the budget?
2. Why don't you use k-means quantization for the latency/energy-constrained experiments? Will you release the code for linear quantization?

Sep 23 '19 21:09 yzhang93
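
For reference, the budget-enforcement step quoted in the first question can be sketched as below. This is a minimal illustration of that one sentence from the paper, not the code in the repo: `enforce_budget`, `cost_fn`, `min_bit`, and the toy model-size cost are hypothetical names and assumptions chosen for the example.

```python
def enforce_budget(bitwidths, cost_fn, budget, min_bit=1):
    """Lower per-layer bitwidths one layer at a time until cost_fn(bits) <= budget.

    bitwidths: list of ints, one per layer (the current policy).
    cost_fn:   hypothetical callable mapping a list of bitwidths to a cost
               (latency, energy, or model size).
    """
    bits = list(bitwidths)
    while cost_fn(bits) > budget:
        changed = False
        for i in range(len(bits)):
            if bits[i] > min_bit:
                bits[i] -= 1          # decrease this layer by one bit
                changed = True
                if cost_fn(bits) <= budget:
                    return bits       # constraint satisfied, stop early
        if not changed:
            # Every layer is already at min_bit and the budget is still exceeded.
            raise ValueError("budget cannot be met even at the minimum bitwidth")
    return bits


# Toy cost (an assumption): model size in bits = sum(params_i * bitwidth_i).
layer_sizes = [100, 200, 50]

def model_size(bits):
    return sum(n * b for n, b in zip(layer_sizes, bits))

policy = enforce_budget([8, 8, 8], model_size, budget=2000)
```

The sketch sweeps the layers in order and stops as soon as the cost fits, which is one plausible reading of "sequentially decrease"; the actual order and step size used by the authors may differ.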

Hi, I have the second question as well. Did you manage to reproduce the quantization method? I tried to reproduce it on CIFAR-10 with ResNet-20, as in Section 3.4 of the paper; however, the linear quantization method didn't work.

Oct 27 '19 02:10 haibao-yu

I find that the code uses k-means quantization, while the paper says to find the optimal clip value that minimizes the KL divergence between the non-quantized and quantized weights/activations. That describes linear quantization, which differs from what the code implements.

Dec 23 '19 05:12 lydiaji
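
The clip-value search mentioned in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: a symmetric uniform quantizer, a histogram-based KL estimate with additive smoothing, and a fixed grid of candidate clip values. All function names are hypothetical, and nothing here is taken from the paper's or the repo's implementation.

```python
import math
import random

def quantize_dequantize(values, num_bits, clip):
    """Symmetric linear quantization (num_bits >= 2), then mapping back to floats."""
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 7 for 4 bits
    scale = clip / qmax
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]

def histogram(values, lo, hi, bins):
    """Count values into equal-width bins over [lo, hi]; outliers go to the edge bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        counts[min(bins - 1, max(0, idx))] += 1
    return counts

def kl_divergence(p_counts, q_counts, eps=1e-6):
    """KL(P || Q) between two histograms, smoothed by eps to avoid log(0)."""
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    kl = 0.0
    for p, q in zip(p_counts, q_counts):
        p_prob = (p + eps) / p_total
        q_prob = (q + eps) / q_total
        kl += p_prob * math.log(p_prob / q_prob)
    return kl

def search_clip(values, num_bits, candidates, bins=32):
    """Return the candidate clip whose quantized histogram is closest to the original."""
    lo, hi = min(values), max(values)
    if hi <= lo:
        raise ValueError("values must span a non-empty range")
    ref = histogram(values, lo, hi, bins)

    def score(clip):
        deq = quantize_dequantize(values, num_bits, clip)
        return kl_divergence(ref, histogram(deq, lo, hi, bins))

    return min(candidates, key=score)


# Synthetic Gaussian "weights" (an assumption, just to exercise the search).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.5) for _ in range(2000)]
candidates = [0.25 * i for i in range(1, 9)]   # 0.25, 0.5, ..., 2.0
best_clip = search_clip(weights, num_bits=4, candidates=candidates)
```

The number of bins and the candidate grid are arbitrary choices here; a real calibration procedure would need to pick them with more care.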

This confuses me as well. The paper uses linear quantization, but the code provides k-means quantization (similar to "Deep Compression"). After k-means quantization, we cannot guarantee that the weights are fixed-point values suitable for fixed-point arithmetic units.

Jan 01 '20 03:01 mepeichun
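
To make the fixed-point concern concrete, the sketch below contrasts the two schemes on a handful of toy weights. It is an illustration only: `linear_quantize` and `kmeans_1d` are hypothetical helpers written for this example (a symmetric uniform quantizer and plain Lloyd's algorithm in one dimension), not functions from the repo.

```python
def linear_quantize(weights, num_bits, clip):
    """Symmetric uniform quantization (num_bits >= 2): integer codes plus one scale."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = clip / qmax
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale

def kmeans_1d(weights, k, iters=20):
    """Plain 1-D k-means (k >= 2), centroids initialised evenly over the value range."""
    lo, hi = min(weights), max(weights)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w in weights:
            nearest = min(range(k), key=lambda j: abs(w - centroids[j]))
            clusters[nearest].append(w)
        # Keep the old centroid when a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return centroids


weights = [-0.9, -0.31, 0.02, 0.27, 0.55, 1.4]

# Linear: every weight becomes an integer code times a shared scale,
# so the arithmetic can run on integer / fixed-point units.
codes, scale = linear_quantize(weights, num_bits=4, clip=1.0)

# k-means: every weight becomes an index into a codebook of arbitrary floats,
# so the stored values are not constrained to a uniform grid.
centroids = kmeans_1d(weights, k=4)
```

With linear quantization the reconstructed weight is `code * scale`, while with k-means it is a codebook lookup whose entries can be any floating-point numbers.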

It's quite unfortunate that the main novelty claimed by the paper, i.e., the use of direct hardware feedback, is conveniently missing from this repo. In fact, even the paper failed to provide a clear explanation of that claim.

Mar 03 '20 19:03 lcmeng

We have updated this repo with linear quantization as well as the hardware resource-constrained part. Please let us know if you have any questions.

May 05 '20 17:05 kuan-wang

Can you please point to the part where the direct HW feedback is used? Thanks. Without that, the repo is still quite limited in significance.

May 05 '20 18:05 lcmeng

Thanks for your feedback! You can find the related code at https://github.com/mit-han-lab/haq/blob/7141586e9ae47c8a50aa8b596ab37682a06b434a/lib/env/linear_quantize_env.py#L306

May 06 '20 07:05 kuan-wang
