How to output the weight parameters of the last hidden layer of Qwen2-0.5B-Instruct-GPTQ-Int4
Hello, I'd like to ask how to output the weight parameters of the last hidden layer of Qwen2-0.5B-Instruct-GPTQ-Int4. When I load the quantized model with AutoModel, it throws an error.
Code:
llm = AutoModel.from_pretrained(model_name, trust_remote_code=True, torch_dtype="auto").to('cuda')
Exception:
ValueError: Block pattern could not be match. Pass block_name_to_quantize argument in quantize_model
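One thing worth trying, assuming the exception comes from optimum's GPTQ integration: `AutoModel` drops the causal-LM head and wraps the model differently, so the block-pattern heuristic may fail to locate the quantized decoder layers. Loading the checkpoint with `AutoModelForCausalLM` keeps the full Qwen2 architecture. A sketch, not verified against this exact checkpoint (the `llm.model.layers` path is assumed from the Qwen2 architecture):

```python
from transformers import AutoModelForCausalLM

model_name = "Qwen/Qwen2-0.5B-Instruct-GPTQ-Int4"

# AutoModelForCausalLM keeps the lm_head and the full decoder stack,
# which lets optimum's GPTQ block-pattern matching find the layers.
llm = AutoModelForCausalLM.from_pretrained(
    model_name, trust_remote_code=True, torch_dtype="auto"
).to("cuda")

# Inspect the quantized tensors of the last decoder layer.
# state_dict() is used because GPTQ stores qweight/qzeros/scales/g_idx
# as buffers, so named_parameters() would not list them.
for name, tensor in llm.model.layers[-1].state_dict().items():
    print(name, tuple(tensor.shape), tensor.dtype)
```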
Hi, I'm not sure what you intended to do. Please clarify:
- "The last hidden layer": do you mean the language-model head, or the last transformer layer (including attention, the FFN, and so on)?
- In case you are not aware, GPTQ is a weight-quantization method, and an original weight matrix in an nn.Linear corresponds to four tensors after quantization: the qweight, the g_idx, the qzeros, and the scales. Together, they represent one weight matrix. What do you mean by "weight parameters"?
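To make the point above concrete: a GPTQ-Int4 layer stores 4-bit integers plus per-group scales and zero points, and an approximate float weight is recovered as w ≈ (q - zero) × scale. A minimal pure-Python sketch with toy numbers (real checkpoints pack eight 4-bit values per int32 and use the g_idx to map rows to groups; none of that bookkeeping is shown here):

```python
# Toy illustration of GPTQ-style 4-bit dequantization.
# One "group" of quantized values shares a single scale and zero point.

def dequantize(q, zero, scale):
    """Recover an approximate float weight from a 4-bit integer."""
    return (q - zero) * scale

qweight = [3, 7, 12, 0, 15]  # 4-bit quantized values in [0, 15]
zero = 8                     # zero point for this group
scale = 0.05                 # per-group scale

weights = [dequantize(q, zero, scale) for q in qweight]
print(weights)
```

So "the weight" of a quantized nn.Linear is not a single tensor you can print directly; it only exists implicitly through this dequantization of the four stored tensors.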
This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.