
About the granularity of weight quantization

Open xingyueye opened this issue 2 years ago • 0 comments

Hi, I'm confused about the granularity of weight quantization. For example, given a weight matrix W of size [4096, 4096] and a groupsize of 128: with per-channel quantization I expected a scale of shape [4096], but instead the scale has shape [4096, 32]. This makes it hard to decode. Could anyone help clear up this confusion?
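(For anyone landing here with the same question: the shape follows from group-wise rather than purely per-channel quantization. Each row of W is split into 4096 / 128 = 32 groups, and every group gets its own scale, so the scale tensor is [4096, 32]. A minimal NumPy sketch of symmetric 4-bit group-wise quantization, not the exact GPTQ-for-LLaMa code, illustrates this:)

```python
import numpy as np

rows, cols, groupsize = 4096, 4096, 128
W = np.random.randn(rows, cols).astype(np.float32)

# Split each row (output channel) into cols // groupsize = 32 groups of 128.
Wg = W.reshape(rows, cols // groupsize, groupsize)

# One symmetric 4-bit scale per (row, group): map max |w| to the int4 max, 7.
scale = np.abs(Wg).max(axis=-1) / 7.0            # shape (4096, 32), not (4096,)
q = np.clip(np.round(Wg / scale[..., None]), -8, 7)

# Dequantize: multiply each group back by its own scale.
W_hat = (q * scale[..., None]).reshape(rows, cols)

print(scale.shape)  # (4096, 32)
```

With groupsize equal to the full row width (4096), the group axis collapses to 1 and you recover the familiar per-channel [4096] scale; smaller groups trade extra scale storage for lower quantization error.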

xingyueye — May 10 '23 10:05