AMDMIGraphX
AMD's graph optimization engine.
Such an operator appears in LLM models quantized to int4 (also with GroupQueryAttention nodes), via the genai tool. Only N=4 needs to be supported in near term (i.e. 4 bits)...
During tier 1 model testing on Navi4, a verification failure was encountered with the BERT large model. The issue also occurs on MI100. Commands to reproduce failure: - migraphx-driver verify...
https://github.com/ROCm/AMDMIGraphX/pull/3319/files#r1715760668 MLIR-related rewrites should be part of a single function.
See this comment: https://github.com/ROCm/AMDMIGraphX/pull/3319/files#r1720104814 That test runs `conv + reduce_sum`, where `reduce_sum` is over multiple axes, so it is reshaped to reduce over a single axis.
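The reshape mentioned in that comment relies on a general property of row-major tensors: summing over several adjacent axes gives the same result as merging those axes into one and summing over that single axis. The sketch below illustrates the property in plain Python. It is not MIGraphX's rewrite code; the helper `reduce_sum_axes` and the example shape are hypothetical, chosen only for illustration.

```python
from itertools import product


def reduce_sum_axes(data, shape, axes):
    """Sum a flat, row-major tensor over `axes`, dropping the reduced axes.

    Hypothetical helper for illustration; not part of MIGraphX.
    """
    kept = [i for i in range(len(shape)) if i not in axes]
    sums = {}
    # itertools.product advances the last index fastest, which matches
    # row-major order, so `flat` is the offset of `idx` in `data`.
    for flat, idx in enumerate(product(*(range(d) for d in shape))):
        key = tuple(idx[i] for i in kept)
        sums[key] = sums.get(key, 0) + data[flat]
    out_shape = [shape[i] for i in kept]
    return [sums[k] for k in product(*(range(d) for d in out_shape))]


# Assumed example: an N x C x H x W tensor stored as a flat list.
shape = (2, 3, 2, 2)
data = list(range(24))

# Reduction over two axes (H and W) at once.
multi_axis = reduce_sum_axes(data, shape, axes=(2, 3))

# Merging H and W into one axis of length H*W leaves the flat data
# unchanged, so only the shape differs; then reduce over that one axis.
merged_shape = (2, 3, 4)
single_axis = reduce_sum_axes(data, merged_shape, axes=(2,))

assert multi_axis == single_axis
```

The equivalence holds directly only when the reduced axes are adjacent in memory; non-adjacent axes would need a transpose before they can be merged.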
Add int4 & uint4 types to MIGraphX
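As background for the 4-bit types, here is a minimal sketch of how two 4-bit values can share one byte, covering both the signed (int4, range -8 to 7) and unsigned (uint4, range 0 to 15) interpretations. The function names and the nibble order (first value in the low nibble) are assumptions made for this example; they do not describe the storage layout MIGraphX uses.

```python
def pack_int4(values, signed=True):
    """Pack 4-bit integers two per byte, first value in the low nibble.

    Hypothetical helper; the nibble order is an assumption.
    """
    lo, hi = (-8, 7) if signed else (0, 15)
    if any(not lo <= v <= hi for v in values):
        raise ValueError(f"values must lie in [{lo}, {hi}]")
    # Masking with 0xF yields the two's-complement nibble for negatives.
    nibbles = [v & 0xF for v in values]
    if len(nibbles) % 2:
        nibbles.append(0)  # pad an odd count with a zero nibble
    return bytes(
        nibbles[i] | (nibbles[i + 1] << 4) for i in range(0, len(nibbles), 2)
    )


def unpack_int4(data, count, signed=True):
    """Recover `count` 4-bit integers from bytes produced by pack_int4."""
    out = []
    for byte in data:
        for nibble in (byte & 0xF, byte >> 4):
            if signed and nibble >= 8:
                nibble -= 16  # sign-extend the 4-bit value
            out.append(nibble)
    return out[:count]


signed_values = [-8, -1, 0, 7, 3]
packed = pack_int4(signed_values)
assert unpack_int4(packed, len(signed_values)) == signed_values

unsigned_values = [0, 15, 9]
packed_u = pack_int4(unsigned_values, signed=False)
assert unpack_int4(packed_u, len(unsigned_values), signed=False) == unsigned_values
```

Passing the element count to `unpack_int4` is needed because an odd number of values leaves a padding nibble in the last byte.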
https://github.com/ROCm/AMDMIGraphX/pull/2875 introduced a workaround for a bug in rocBLAS. It should be removed once a fix from rocBLAS is available in a released ROCm version.