yuanqian_zhao

Results 21 comments of yuanqian_zhao

Same issue; is this available yet?

Hi! Any progress? Is training LoRA modules with AWQ available now?

> Hi, I'm also interested to know whether LoRA + AWQ is already available now. Thanks!

@RicardoHalak see this; it is runnable: https://github.com/huggingface/transformers/pull/28987
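For anyone landing here, a minimal sketch of the LoRA-on-AWQ flow that PR enables; the checkpoint name and LoRA hyperparameters below are illustrative placeholders, not values taken from the PR:

```python
# Sketch: attach LoRA adapters to an AWQ-quantized checkpoint.
# Requires transformers (with autoawq installed) and peft; the model id
# and LoRA hyperparameters are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "TheBloke/Llama-2-7B-AWQ"  # any AWQ checkpoint; placeholder
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # adjust to the model's layer names
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```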

Is it available now? I simply applied GPTQ and AWQ to Yi-6B and tried LoRA training on top of it; however, the loss is NaN.
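For what it's worth, one generic mitigation for NaN loss when LoRA-training a quantized base is peft's `prepare_model_for_kbit_training`, which upcasts the norm layers to fp32 and enables input gradients before the adapters are attached. A sketch under those assumptions, not a confirmed fix for Yi-6B:

```python
# Generic NaN-loss mitigation sketch for LoRA on a quantized base model:
# upcast norm layers to fp32 and enable input gradients before wrapping.
# The checkpoint name is a placeholder.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model = AutoModelForCausalLM.from_pretrained("your-quantized-checkpoint",  # placeholder
                                             device_map="auto")
model = prepare_model_for_kbit_training(model)  # fp32 norms, input grads on
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))
```

Lowering the learning rate is another common first check before digging deeper.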

@maxin9966 this may be irrelevant to your question, but I'm wondering why, in your code for the chat model, `attention_mask` is just `input_ids.ne(tokenizer.pad_token_id)`; maybe computing the loss only on the response tokens would be better, as in the sketch below.
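A minimal sketch of what I mean, assuming `prompt_ids` and `response_ids` come from tokenizing the prompt and response separately (all names here are illustrative, not from the original code):

```python
# Sketch: compute loss only on the response tokens by setting the prompt
# positions of `labels` to -100, which cross-entropy ignores.
import torch

input_ids = torch.cat([prompt_ids, response_ids], dim=-1)
labels = input_ids.clone()
labels[..., : prompt_ids.shape[-1]] = -100  # ignore prompt tokens in the loss
attention_mask = input_ids.ne(tokenizer.pad_token_id)
outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
```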

The loss appears to decrease when you increase the number of samples simply because the value printed on your screen is the total loss divided by the number of samples, i.e., a mean, not a total.
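Concretely, with the default `reduction="mean"` the printed value is a per-sample average, so it does not grow with sample count:

```python
# The default reduction averages over samples, so the printed loss
# does not scale with the number of samples.
import torch
import torch.nn.functional as F

logits = torch.randn(100, 10)            # 100 samples, 10 classes
targets = torch.randint(0, 10, (100,))
mean_loss = F.cross_entropy(logits, targets)                   # default: "mean"
sum_loss = F.cross_entropy(logits, targets, reduction="sum")
assert torch.isclose(mean_loss, sum_loss / 100)
```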

> Hi, for the W4A16 configuration, most mainstream quantization algorithms can achieve good results. If using a per-channel setting for W4A16, you can choose any quantization algorithm you prefer. However,...

I tried applying the QoQ method to MiniCPM3-4B and measured the accuracy drops on the BBH/MMLU/C-Eval/CMMLU/HumanEval/MBPP/GSM8K/MATH benchmarks. C-Eval, HumanEval, MBPP, and GSM8K dropped by about 10 percentage points, while the...

@HandH1998 Yes, it is rotation + GPTQ, with evaluation based on Transformers + UltraEval. Specifically, I applied lmquant (w4a8kv4, group size = 32, only adding all R1-type rotation matrices declared in SpinQuant, no smoothing involved) to...
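For readers unfamiliar with the R1 rotations: they are orthogonal matrices folded into adjacent weights and activations, so the full-precision function is unchanged while outliers get spread out before quantization. A minimal numerical sketch of that invariance with a random orthogonal matrix (SpinQuant's actual R1 is learned/Hadamard-based; this only demonstrates the identity):

```python
# R1-style rotation identity: folding an orthogonal matrix Q into both
# activations and weights leaves the full-precision output unchanged,
# while the rotated weights are friendlier to quantize.
import torch

d = 64
W = torch.randn(128, d)                    # a linear layer's weight
x = torch.randn(4, d)                      # a batch of activations
Q, _ = torch.linalg.qr(torch.randn(d, d))  # random orthogonal matrix

y = x @ W.T                                # original output
y_rot = (x @ Q) @ (W @ Q).T                # rotated activations and weights
assert torch.allclose(y, y_rot, atol=1e-4)
```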