neural-compressor
Enable Llama MoE models' GPTQ quantization
Type of Change
new feature
@YIYANGCAI please resolve the conflict. Will this PR target the v2.6 release?
Hi @YIYANGCAI, I saw that nn.Conv2d and nn.Conv1d are supported in GPTQ. Does that mean MoE models contain these two op types? I previously thought that only transformers.Conv1D was required.
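To make the op-type question concrete, here is a minimal sketch of how a GPTQ-style pass could discover quantizable layers in a model by module type. The supported-type list and the `ToyMoE` model are assumptions for illustration, not neural-compressor's actual internals:

```python
import torch.nn as nn

# Op types the discussion mentions as GPTQ-supported (assumption;
# real frameworks may also handle transformers.Conv1D and others).
GPTQ_SUPPORTED_TYPES = (nn.Linear, nn.Conv1d, nn.Conv2d)

def find_quantizable_layers(model: nn.Module) -> dict:
    """Return {qualified_name: module} for every supported layer."""
    return {
        name: module
        for name, module in model.named_modules()
        if isinstance(module, GPTQ_SUPPORTED_TYPES)
    }

# Toy MoE-style block (hypothetical): a router plus two expert MLPs.
class ToyMoE(nn.Module):
    def __init__(self, dim: int = 8, num_experts: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.Linear(dim, dim))
            for _ in range(num_experts)
        )

layers = find_quantizable_layers(ToyMoE())
print(sorted(layers))  # router + the four expert Linear layers
```

If the MoE experts are plain `nn.Linear` layers (as in most Llama-style MoE implementations), a pass like this would pick them up without any Conv support at all; Conv types only matter for models that actually use them.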