AQLM
Actual bitrate of models on github?
Are the models reported in your README supposed to be actual 2-bit models, or 2.x-bit models? For example, the two 7B models below are both larger than a true 2-bit decoder model, which would take about 2.1 GB on disk. Also, why is there such a large size difference between the 1x16 and 2x8 models? The gap is much larger than the codebook-size delta alone should account for. Are you using different group sizes in each model? Thanks
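For reference, here is a rough sketch of the arithmetic behind the question, assuming the scheme from the AQLM paper (each group of `group_size` weights is encoded by `num_codebooks` codes, each indexing a codebook of `2**code_bits` fp16 vectors). The function names are mine, and this ignores scales and any layers left unquantized, so it is an estimate, not the repo's actual packing:

```python
def aqlm_bits_per_weight(num_codebooks: int, code_bits: int,
                         group_size: int) -> float:
    """Nominal code payload per weight: each group of `group_size`
    weights stores `num_codebooks` codes of `code_bits` bits each.
    (Ignores codebooks, scales, and unquantized layers.)"""
    return num_codebooks * code_bits / group_size


def codebook_bytes(num_codebooks: int, code_bits: int,
                   group_size: int) -> int:
    """Per-layer codebook storage in fp16: each codebook holds
    2**code_bits entries of `group_size` half-precision values."""
    return num_codebooks * (2 ** code_bits) * group_size * 2


# Both configs give the same 2.0-bit code payload at group size 8 ...
print(aqlm_bits_per_weight(1, 16, 8))  # 2.0
print(aqlm_bits_per_weight(2, 8, 8))   # 2.0

# ... but the 1x16 codebook is much larger per layer than 2x8:
print(codebook_bytes(1, 16, 8))  # 1048576 bytes (~1 MiB per layer)
print(codebook_bytes(2, 8, 8))   # 8192 bytes
```

Under these assumptions the code payload is identical for 1x16 and 2x8, and the per-layer codebook overhead (roughly 1 MiB vs 8 KiB) is the only delta this model predicts, which is why the observed gap seems too large unless the group sizes differ.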