AQLM
Questions about AQLM fine-tuning and inference
- Does the fine-tuning have to be applied to every layer, or could I fine-tune only some layers in a single pass?
- Is the codebook-based quantization method slower than AWQ at inference time?
- When I run inference, generation succeeds with max_new_tokens=512 but fails with max_new_tokens=2048.
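To make the last question reproducible, here is a minimal sketch of how the generation might be invoked. The model checkpoint, prompt, and loading options are assumptions, not taken from the original post; posting the actual error trace from the max_new_tokens=2048 run would help diagnose whether it is an out-of-memory or a kernel issue.

```python
# Hypothetical repro sketch for the max_new_tokens question.
# The checkpoint name and prompt are assumptions, not from the original post.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ISTA-DASLab/Llama-2-7b-AQLM-2Bit-1x16-hf"  # example AQLM checkpoint


def generate(max_new_tokens: int) -> str:
    """Load the AQLM-quantized model and generate max_new_tokens tokens."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(out[0], skip_special_tokens=True)


if __name__ == "__main__":
    if torch.cuda.is_available():
        print(generate(512))   # reportedly succeeds
        print(generate(2048))  # reportedly fails; the traceback is needed to diagnose
    else:
        print("CUDA not available; skipping AQLM inference repro")
```

If the 2048-token run dies with a CUDA out-of-memory error, the cause is likely the growing KV cache rather than AQLM itself.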