How to output the weight parameters of the last hidden layer of Qwen2-0.5B-Instruct-GPTQ-Int4
Hello, I'd like to ask how to output the weight parameters of the last hidden layer of Qwen2-0.5B-Instruct-GPTQ-Int4. When I load the quantized model with AutoModel, it throws an error.
Code:
llm = AutoModel.from_pretrained(model_name, trust_remote_code=True, torch_dtype="auto").to('cuda')
Exception:
ValueError: Block pattern could not be match. Pass block_name_to_quantize argument in quantize_model
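One thing worth trying, assuming the exception comes from optimum's GPTQ integration: `AutoModel` drops the causal-LM head and wraps the model differently, so the block-pattern heuristic may fail to locate the quantized decoder layers. Loading the checkpoint with `AutoModelForCausalLM` keeps the full Qwen2 architecture. A sketch, not verified against this exact checkpoint (the `llm.model.layers` path is assumed from the Qwen2 architecture):

```python
from transformers import AutoModelForCausalLM

model_name = "Qwen/Qwen2-0.5B-Instruct-GPTQ-Int4"

# AutoModelForCausalLM keeps the lm_head and the full decoder stack,
# which lets optimum's GPTQ block-pattern matching find the layers.
llm = AutoModelForCausalLM.from_pretrained(
    model_name, trust_remote_code=True, torch_dtype="auto"
).to("cuda")

# Inspect the quantized tensors of the last decoder layer.
# state_dict() is used because GPTQ stores qweight/qzeros/scales/g_idx
# as buffers, so named_parameters() would not list them.
for name, tensor in llm.model.layers[-1].state_dict().items():
    print(name, tuple(tensor.shape), tensor.dtype)
```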
Hi, I'm not sure what you intended to do. Please clarify:
- "The last hidden layer": do you mean the language-model head, or the last transformer layer (including attention, the FFN, and so on)?
- In case you are not aware, GPTQ is a weight-quantization method, and an original weight matrix in an nn.Linear corresponds to four tensors after quantization: the qweight, the g_idx, the qzeros, and the scales. Together, they represent one weight matrix. What do you mean by "weight parameters"?
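To make the point above concrete: a GPTQ-Int4 layer stores 4-bit integers plus per-group scales and zero points, and an approximate float weight is recovered as w ≈ (q - zero) × scale. A minimal pure-Python sketch with toy numbers (real checkpoints pack eight 4-bit values per int32 and use the g_idx to map rows to groups; none of that bookkeeping is shown here):

```python
# Toy illustration of GPTQ-style 4-bit dequantization.
# One "group" of quantized values shares a single scale and zero point.

def dequantize(q, zero, scale):
    """Recover an approximate float weight from a 4-bit integer."""
    return (q - zero) * scale

qweight = [3, 7, 12, 0, 15]  # 4-bit quantized values in [0, 15]
zero = 8                     # zero point for this group
scale = 0.05                 # per-group scale

weights = [dequantize(q, zero, scale) for q in qweight]
print(weights)
```

So "the weight" of a quantized nn.Linear is not a single tensor you can print directly; it only exists implicitly through this dequantization of the four stored tensors.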
This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.