
a little question about information entropy

Open · XA23i opened this issue 1 year ago · 1 comment

I am wondering why, in your paper, the information entropy is calculated from the latent full-precision weights rather than from the binarized weights. It seems to make little sense to consider the latent weights for this.

XA23i · Apr 26 '23

Hello XA23i,

I'm not the author of this research, but I believe the reasoning is tied to how the weights are binarized with the sign function. Because sign(w) can only take the values −1 and +1, the information entropy of the binarized weights is determined entirely by the proportion of latent weights that fall on either side of zero. The authors model the latent full-precision weights as following a Laplacian distribution throughout training, so working with that distribution is what lets them analyze and manipulate the entropy, pushing it higher or lower as needed.
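For illustration only (this is not the authors' code, and the Laplacian parameters are placeholders), here is a minimal sketch of the idea that the entropy of sign-binarized weights depends only on the latent-weight distribution:

```python
import numpy as np

def binary_entropy(w_latent):
    """Entropy (in bits) of sign-binarized weights.

    Since sign(w) takes only the values {-1, +1}, the entropy depends
    only on the fraction of latent weights that are non-negative.
    """
    p = np.mean(w_latent >= 0)          # probability of the +1 state
    if p in (0.0, 1.0):                 # degenerate case: all weights share one sign
        return 0.0
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

rng = np.random.default_rng(0)

# Latent weights modeled as Laplacian, matching the distributional assumption in the answer above.
w_balanced = rng.laplace(loc=0.0, scale=1.0, size=10_000)

# Shifting the latent distribution skews the sign proportions and lowers the entropy.
w_skewed = rng.laplace(loc=1.0, scale=1.0, size=10_000)

print(f"balanced latent weights -> entropy ~ {binary_entropy(w_balanced):.3f} bits")
print(f"skewed latent weights   -> entropy ~ {binary_entropy(w_skewed):.3f} bits")
```

Since the binarized weights themselves are just ±1, it is the latent distribution that can actually be adjusted during training, which is presumably why the entropy is expressed in terms of the latent weights.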

Enmartz · May 25 '24