Loading a pre-quantized 8-bit model
Just saw this from the integration blog:
> **Saving 8-bit state dicts on the Hub**
> 8-bit state dicts cannot currently be loaded directly into the 8-bit model after being pushed to the Hub. This is because the statistics (remember `weight.CB` and `weight.SCB`) computed by the model are not currently stored or taken into account inside the state dict, and the `Linear8bitLt` module does not support this feature yet. We think that having the ability to save that and push it to the Hub might contribute to greater accessibility.
Does this still hold with the latest version of bnb? When I experimented with the latest bnb, `weight.SCB` could be saved into the model's state dict.
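For what it's worth, one way to check is simply to inspect the keys of the serialized state dict before and after a round trip. Below is a minimal, dependency-free sketch of that check. The key names (`linear.weight`, `linear.weight.SCB`) are stand-ins mirroring the `Linear8bitLt` naming from the blog quote, and plain Python lists stand in for the int8 tensors so no GPU or bitsandbytes install is needed; with a real model you would use `torch.save`/`torch.load` on `model.state_dict()` instead.

```python
import io
import pickle

# Hypothetical stand-in for an 8-bit Linear layer's state dict.
# In a real bitsandbytes setup these would be tensors produced by
# Linear8bitLt; plain lists are used so this sketch runs anywhere.
state_dict = {
    "linear.weight": [[1, -2], [3, 4]],  # int8 weight data (CB)
    "linear.weight.SCB": [0.5, 0.25],    # per-row scaling statistics
}

# Round-trip through a serialized buffer, mimicking torch.save/torch.load.
buf = io.BytesIO()
pickle.dump(state_dict, buf)
buf.seek(0)
restored = pickle.load(buf)

# If the statistics survive serialization, the SCB entry is still present.
print("linear.weight.SCB" in restored)  # → True
```

If the real `model.state_dict()` from a recent bnb version contains the `.SCB` entries after loading, that would suggest the limitation described in the blog has since been lifted.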