LiMa-cas

Results 7 issues of LiMa-cas

During inference, is it much slower, since I need an if/else to check which precision to dequantize with?

Hi, what's the difference between llm-awq and autoawq? Thanks in advance!

Hello, how much time does it take, and which datasets did you use?

1. Does fine-tuning need to be done for each layer? Could I fine-tune once for several layers? 2. Is the codebook quantization method slower than AWQ? 3. When I run inference, it is successful...

![image](https://github.com/user-attachments/assets/7a8e2b8a-3b61-49f8-8602-e579963850df) torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 28.00 GiB. GPU 0 has a total capacity of 47.54 GiB of which 9.50 GiB is free. Process 1509125 has 9.68...