neural-compressor
SmoothQuant pattern detection is incomplete when folding=True
For LLaMA, two patterns are not detected: mlp.down_proj -> mlp.up_proj and self_attn.o_proj -> self_attn.v_proj.
For OPT, one pattern is not detected: self_attn.out_proj -> self_attn.v_proj.
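
To illustrate why these pairs should be foldable, here is a minimal sketch of the mlp.down_proj -> mlp.up_proj case in plain Python. It is not neural-compressor's implementation: the helper names (`mlp`, `fold_scale`), the tiny dimensions, and the scale values are all made up for illustration, and it assumes a LLaMA-style MLP of the form down_proj(silu(gate_proj(x)) * up_proj(x)) with weights stored as [out_features, in_features]. The per-channel smoothing scale on the down_proj input is multiplied into the down_proj weight columns and divided out of the matching up_proj output rows, which should leave the MLP output unchanged.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix stored as a list of rows by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def silu(v):
    return v / (1.0 + math.exp(-v))


def mlp(x, W_gate, W_up, W_down):
    """Assumed LLaMA-style MLP: down_proj(silu(gate_proj(x)) * up_proj(x))."""
    gate = [silu(g) for g in matvec(W_gate, x)]
    up = matvec(W_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(W_down, hidden)


def fold_scale(W_up, W_down, scale):
    """Hypothetical helper: fold a per-channel scale on down_proj's input.

    Row j of up_proj (its j-th output channel) is divided by scale[j];
    column j of down_proj (its j-th input channel) is multiplied by scale[j].
    """
    W_up_folded = [[w / s for w in row] for row, s in zip(W_up, scale)]
    W_down_folded = [[w * s for w, s in zip(row, scale)] for row in W_down]
    return W_up_folded, W_down_folded


rng = random.Random(0)
d_model, d_hidden = 3, 4


def rand_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


W_gate = rand_matrix(d_hidden, d_model)
W_up = rand_matrix(d_hidden, d_model)
W_down = rand_matrix(d_model, d_hidden)
x = [rng.uniform(-1.0, 1.0) for _ in range(d_model)]
scale = [0.5, 2.0, 4.0, 0.25]  # illustrative values, one per hidden channel

W_up_folded, W_down_folded = fold_scale(W_up, W_down, scale)
reference = mlp(x, W_gate, W_up, W_down)
folded = mlp(x, W_gate, W_up_folded, W_down_folded)
max_diff = max(abs(a - b) for a, b in zip(reference, folded))
print("max abs difference:", max_diff)
```

The same argument should apply to self_attn.o_proj -> self_attn.v_proj (and self_attn.out_proj -> self_attn.v_proj for OPT), since the attention weights only mix tokens and do not mix channels, so a per-channel scale on the o_proj input can be moved into the v_proj output channels.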