AMDMIGraphX issues

Scale is 0 for quantizelinear and dequantizelinear

Problem: Using a scale of 0 in dequantizelinear is incorrect: it makes the quantized input irrelevant. The optimization passes identify this as dead code and eliminate it. A proper dequantization...

aarushjain29

bug
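
To illustrate why a zero scale makes the quantized input irrelevant, here is a minimal sketch of the standard ONNX DequantizeLinear formula, `y = (x - zero_point) * scale`, in plain Python. The helper name `dequantize_linear` and the sample values are assumptions for illustration only; this is not MIGraphX code.

```python
def dequantize_linear(x_q, scale, zero_point=0):
    """Elementwise reference formula: y = (x_q - zero_point) * scale."""
    return [(v - zero_point) * scale for v in x_q]


# Two different quantized inputs, both dequantized with scale = 0.
a = dequantize_linear([-128, 0, 127], scale=0.0)
b = dequantize_linear([5, 17, -42], scale=0.0)

# Every output is zero, so the result no longer depends on the input.
# This is why an optimizer can treat the input path as dead code.
assert a == b == [0.0, 0.0, 0.0]

# With a nonzero scale the input values matter again.
c = dequantize_linear([-128, 0, 127], scale=0.5)
assert c == [-64.0, 0.0, 63.5]
```

The quantizelinear direction divides by the scale, so a scale of 0 is ill-defined there as well.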

Fuse Gemm-Elementwise-Gemm with mlir

Since mlir supports this fusion we need to update `fuse_mlir` to find such patterns and fuse them. For now we should probably have a flag to enable the fusion until...

pfultz2

Update CK commit hash

This PR updates CK commit hash after merging https://github.com/ROCm/composable_kernel/pull/2552

slojosic-amd

Measure when MLIR is faster or slower to update the heuristic

Gather the mxr/Python files from all the models, including NCHW/NHWC, fp16/bf16/fp32, and rewrite_dot on/off. Measure the performance for each file with and without MLIR on all the different hardware. Make a list...

pfultz2

Set attribute to help bypass the warning about amdgpu_waves_per_eu

The upgraded toolchain emits a new compile warning that must be bypassed for the topk test to compile and run successfully.
> [ RUN ] test_topk /tmp/comgr-d7a292/input/main.cpp:11:22:...

lakhinderwalia

Make Onnx resize ALWAYS parse to a Resize op

Low-priority enhancement to [Issue 1670](https://github.com/ROCm/AMDMIGraphX/issues/1670): the `make_gather_instruction()` calls in `parse_resize.cpp` can be replaced by invoking the Resize op, as follows: /* TODO: to make Onnx resize ALWAYS parse to...

bpickrel

enhancement

onnx

Python API - Hide GPU specific functions

* Currently the `allocate_gpu`, `to_gpu`, `from_gpu` and `gpu_sync` functions are exposed through the Python API.
* We should make it such that these functions are hidden behind the `target.copy_to()` and...

CharlieL7

enhancement

TedThemistokleous

onnx

Onnx Operators

Refactor gemm fusion paths to allow benchmarking CK, MLIR, and rocBLAS together during compilation

Improve fusions around reshape/transpose/broadcast

Add Sequence Operators
