
Enable Llama MoE models' GPTQ quantization

Open YIYANGCAI opened this issue 1 year ago • 1 comments

Type of Change

new feature

YIYANGCAI • Mar 04 '24 02:03

@YIYANGCAI, please resolve the conflict. Will this PR target the v2.6 release?

chensuyue • May 21 '24 06:05

Hi @YIYANGCAI, I see that nn.Conv2d and nn.Conv1d are supported in GPTQ. Does that mean MoE models have these two op types? I previously thought that only transformer.conv1d was required.

xin3he • May 27 '24 08:05