OmniQuant
OPT-30B
"
CUDA_VISIBLE_DEVICES=0 python main.py \
--model /home/Projects/model_zoo/facebook/opt-30b \
--epochs 20 --output_dir ./log/opt-30b-w6a6 \
--wbits 6 --abits 6 --lwc --let --alpha 0.75 --eval_ppl \
--net opt-30b
"
When I use OmniQuant to quantize OPT-30B to W6A6 with the command above, an error occurs in omniquant.py at this line: scale = (act.pow(args.alpha)/weight.pow(1-args.alpha)).clamp(min=1e-5)
"RuntimeError: The size of tensor a (7168) must match the size of tensor b (5120) at non-singleton dimension 0"
I checked the tensors involved: act has shape [7168], but weight has shape [5120].
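To show why these two shapes cannot be combined in the elementwise division, here is a minimal pure-Python sketch of the usual broadcasting rule (two dimensions are compatible only if they are equal or one of them is 1). The helper name broadcastable is hypothetical and is not part of OmniQuant or PyTorch; it only illustrates the check that fails.

```python
from itertools import zip_longest

def broadcastable(shape_a, shape_b):
    """Return True if two shapes are compatible under the usual broadcasting rule."""
    # Compare dimensions from the last one backwards; a missing leading
    # dimension is treated as size 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            return False
    return True

print(broadcastable((7168,), (5120,)))  # shapes of act and weight from the report -> False
print(broadcastable((7168,), (7168,)))  # matching shapes -> True
```

So the division can only work if act and weight have the same length along dimension 0 (or one of them has length 1), which is not the case here.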