AMDMIGraphX issues

[MLIR][Attention] Implement gemm(i8)-dequantizelinear-softmax(fp16)-gemm(fp16) lowering

### Problem Description This ticket is to implement the gemm(i8)-dequantizelinear-softmax(fp16)-gemm(fp16) pattern as a partial i8 attention kernel in rocMLIR. Here is one of the example tests we currently have working...

manupak
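To make the pattern named in this entry concrete, here is a minimal reference sketch of the four stages in plain Python. It is not the rocMLIR lowering: the function names are hypothetical, a single per-tensor scale is assumed, `dequantize_linear` follows the ONNX DequantizeLinear formula `(x - zero_point) * scale`, and Python floats stand in for fp16.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def dequantize_linear(x, scale, zero_point=0):
    """ONNX-style DequantizeLinear with a per-tensor scale (assumption)."""
    return [[(v - zero_point) * scale for v in row] for row in x]

def softmax_rows(x):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in x:
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def partial_i8_attention(q_i8, k_t_i8, v, scale):
    """gemm(i8) -> dequantizelinear -> softmax -> gemm, as a reference only."""
    scores_int = matmul(q_i8, k_t_i8)           # integer GEMM, exact accumulation
    scores = dequantize_linear(scores_int, scale)
    probs = softmax_rows(scores)                # floating-point from here on
    return matmul(probs, v)

# Hypothetical inputs: int8-range Q and K^T, floating-point V.
q_i8 = [[1, 2], [3, -4]]
k_t_i8 = [[5, -6], [7, 8]]
v = [[1.0, 0.0], [0.0, 1.0]]
out = partial_i8_attention(q_i8, k_t_i8, v, scale=0.01)
```

Because `v` is the identity here, each row of `out` is simply the softmax of the dequantized scores and sums to 1.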

Integrate hipblaslt calls in gemm op

4

mqhc2020

use mlir for smaller 3x3 kernels

3

umangyadav

[Tracking] Integration of the rocMLIR SplitK GEMM scheme to MIGraphX

1

This is a **tracking** issue and is supposed to synchronize both rocMLIR and MIGraphX teams regarding the SplitK GEMM scheme integration. The overall design is based on the [proposal](https://github.com/ROCm/AMDMIGraphX/discussions/2858). ##...

ravil-mobile

roadmap

Use splitk reductions for large reductions

pfultz2

Split pointwise from mlir op when splitk is being used

When compiling for gemm using splitk for MLIR, split the pointwise operators and generate code object using our regular pointwise code generation.

pfultz2
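The split-K entries above can be illustrated with a small sketch. This is a generic split-K GEMM in plain Python, not the rocMLIR scheme or MIGraphX code; all function names are hypothetical. It also shows one plausible reason (an assumption, not stated in the issue) for splitting pointwise operators out: a nonlinear pointwise op such as ReLU gives a different result if it is applied to each partial product instead of to the reduced sum.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(x, y):
    """Elementwise sum of two equally shaped matrices."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def relu(x):
    """Elementwise ReLU, standing in for any nonlinear pointwise op."""
    return [[max(v, 0) for v in row] for row in x]

def splitk_partials(a, b, num_splits):
    """Slice the shared K dimension and return one partial product per slice."""
    k = len(b)
    chunk = max(1, (k + num_splits - 1) // num_splits)
    partials = []
    for start in range(0, k, chunk):
        a_part = [row[start:start + chunk] for row in a]
        b_part = b[start:start + chunk]
        partials.append(matmul(a_part, b_part))
    return partials

def reduce_partials(partials):
    """Sum the partial products to obtain the full GEMM result."""
    total = partials[0]
    for p in partials[1:]:
        total = add(total, p)
    return total

a = [[3, -1, -5, 1]]
b = [[1], [1], [1], [1]]
partials = splitk_partials(a, b, num_splits=2)   # [[2]] and [[-4]]
full = reduce_partials(partials)                 # [[-2]], same as matmul(a, b)

pointwise_after_reduce = relu(full)                                  # [[0]]
pointwise_on_partials = reduce_partials([relu(p) for p in partials]) # [[2]]
```

Running the pointwise op as a separate step after the reduction keeps the result equal to the unsplit GEMM followed by the pointwise op.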

ONNX + Torch MIGraphX SDXL Script

shivadbhavsar

Print types in abbreviated form

For the detailed perf report we can print the types as abbreviated types so we can attach to each tensor size:
* double_type -> f64
* float_type -> f32
*...

pfultz2
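A minimal sketch of the abbreviation idea from this entry, in Python. Only the two mappings visible in the issue preview are included; the helper names and the `type[dims]` output layout are hypothetical and not MIGraphX's actual report format.

```python
# Only the mappings shown in the issue preview; the rest of the list is truncated there.
ABBREVIATIONS = {
    "double_type": "f64",
    "float_type": "f32",
}

def abbreviate_type(name):
    """Return the short form of a type name, or the name unchanged if unknown."""
    return ABBREVIATIONS.get(name, name)

def format_tensor(type_name, dims):
    """Attach the abbreviated type to a tensor size (hypothetical layout)."""
    return f"{abbreviate_type(type_name)}[{','.join(str(d) for d in dims)}]"

label = format_tensor("float_type", [1, 3, 224, 224])
```

Falling back to the full name for unmapped types keeps the report readable even before every type has an agreed abbreviation.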

Keep LayerNorm accumulator at FP32

When a model is quantized to FP16, LayerNorm is also quantized. This leads to an accuracy problem. Make the code changes needed to hold LayerNorm as always FP32 accumulation. Then...

causten
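As an illustration of why the accumulator precision matters, the sketch below emulates the running sum behind a LayerNorm mean with FP16 and with FP32 rounding. It is not MIGraphX code: rounding is emulated with Python's `struct` half (`'e'`) and single (`'f'`) formats, the helper names are hypothetical, and the input is deliberately chosen to expose the effect.

```python
import struct

def to_f16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def to_f32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def mean_with_accumulator(values, round_fn):
    """Compute a mean, rounding the running sum after every addition."""
    acc = 0.0
    for v in values:
        acc = round_fn(acc + v)
    return round_fn(acc / len(values))

values = [1.0] * 4096          # every element is exactly representable in FP16

mean_fp16 = mean_with_accumulator(values, to_f16)
mean_fp32 = mean_with_accumulator(values, to_f32)
```

With an FP16 accumulator the running sum stops growing at 2048, because 2049 is not representable and rounds back to 2048, so `mean_fp16` comes out as 0.5 instead of 1.0. The FP32 accumulator represents every intermediate sum exactly and returns 1.0.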

Improve eliminate_concat

We want to be able to insert a `copy` operator for some concat cases where:
* Almost all the inputs are a precompile_op
* Only 1 copy is needed

To do...

pfultz2
