Missing constant propagation: `Literal` -> `Multibroadcast` -> `Quantizelinear`

Open CharlieL7 opened this issue 1 year ago • 0 comments

  • Found during Inference Model Review meeting
  • Seen in bert_base_cased and distilgpt2_fp16 when run with our --fp8 flag; probably also occurs with --int8
@12 = hip::hip_copy_literal[id=main:@literal:17] -> half_type, {768, 2304}, {2304, 1}
@13 = load[offset=188743680,end=190513152](@1) -> fp8e4m3fnuz_type, {64, 768, 2304}, {0, 2304, 1}
@14 = multibroadcast[out_lens={64, 768, 2304},out_dyn_dims={}](@12) -> half_type, {64, 768, 2304}, {0, 2304, 1}
@15 = gpu::code_object[code_object=5088,symbol_name=quantizelinear_kernel,global=113246208,local=1024,](@14,@13) -> fp8e4m3fnuz_type, {64, 768, 2304}, {0, 2304, 1}
  • Example from distilgpt2_fp16
    • driver command: bin/driver perf /codes/distilgpt2_1_fp16_gpu.onnx --fp8 --fill1 input_ids --input-dim @input_ids 64 384 --batch 64
  • The input to instruction @15 (quantizelinear) is @14, a multibroadcast of the literal @12. The find_inner_broadcasts matcher in the simplify_algebra compiler pass should have swapped the broadcast with the quantizelinear so that the quantizelinear is applied directly to the literal. The propagate_constant pass could then fold the result into a constant.
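
The reordering described above is valid because quantizelinear is elementwise, so quantizing a broadcasted literal gives the same values as broadcasting the quantized literal. The following is a minimal sketch in plain Python, not MIGraphX code. It assumes a single scalar scale, uses int8-style rounding and saturation as a stand-in for fp8, and every name in it (`quantize_linear`, `broadcast_batch`, `quantize_matrix`) is hypothetical.

```python
def quantize_linear(x, scale, zero_point=0, lo=-128, hi=127):
    """Elementwise int8-style quantization (stand-in for fp8): scale, round, saturate."""
    q = round(x / scale) + zero_point
    return max(lo, min(hi, q))


def broadcast_batch(matrix, batch):
    """Mimic a stride-0 broadcast on a new leading axis: each batch entry aliases the same matrix."""
    return [matrix for _ in range(batch)]


def quantize_matrix(matrix, scale):
    """Apply quantize_linear to every element of a 2-D list."""
    return [[quantize_linear(v, scale) for v in row] for row in matrix]


# Small stand-in for the {768, 2304} literal; values and scale are made up.
literal = [[0.5, -1.25, 2.0], [3.75, 0.0, -0.5]]
scale = 0.25  # assumed scalar scale shared by all elements
batch = 4     # stand-in for the batch of 64

# Order seen in the issue: broadcast first, then quantize every batch entry at run time.
quantize_after = [quantize_matrix(m, scale) for m in broadcast_batch(literal, batch)]

# Order after the swap: quantize the literal once, then broadcast the result.
quantize_before = broadcast_batch(quantize_matrix(literal, scale), batch)

# Both orders produce identical values.
assert quantize_after == quantize_before
```

In the second order the quantization depends only on literals, which is the situation in which a constant-folding pass can evaluate it at compile time. At run time only the broadcast of the already quantized tensor would remain.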

CharlieL7 · Nov 07 '24 22:11