CC0000000

Results: 1 comment by CC0000000

(I may be wrong, but anyway.) According to https://github.com/microsoft/unilm/blob/master/bitnet/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf, during training x and w are scaled, quantized, and then rescaled back to the original scale. (Probably due to how the...
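
For illustration, here is a minimal sketch of the scale, quantize, rescale pattern mentioned above, in plain Python. The function names, the `eps` guard, and the specific choices (absmax scaling to an 8-bit range for x, mean-absolute-value scaling to {-1, 0, 1} for w) are assumptions made for this example, not something stated in the comment; the linked PDF is the place to check the actual code.

```python
# Sketch only: "fake quantization" where values are scaled, rounded,
# clamped, and then divided by the same scale to return to the
# original range. Names and eps are illustrative assumptions.

def activation_quant(x, eps=1e-5):
    """Scale x to an 8-bit range by its max magnitude, round, clamp, rescale back."""
    scale = 127.0 / max(max(abs(v) for v in x), eps)
    return [max(-128, min(127, round(v * scale))) / scale for v in x]

def weight_quant(w, eps=1e-5):
    """Scale w by its mean magnitude, round to {-1, 0, 1}, rescale back."""
    scale = 1.0 / max(sum(abs(v) for v in w) / len(w), eps)
    return [max(-1, min(1, round(v * scale))) / scale for v in w]

x = [0.5, -1.0, 0.25]
w = [0.4, -0.1, 0.9, -0.6]
x_q = activation_quant(x)  # still in the original scale of x
w_q = weight_quant(w)      # only three distinct magnitudes survive
```

Because of the final division by `scale`, the outputs stay in the same numeric range as the inputs even though each one sits on a coarse quantization grid.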