AMDMIGraphX
AMD's graph optimization engine.
Problem: Using a scale of 0 in `dequantizelinear` is incorrect: it makes the quantized input irrelevant. The optimization passes identify this as dead code and eliminate it. A proper dequantization...
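To illustrate why a zero scale makes the quantized input irrelevant, here is a minimal sketch of the elementwise dequantization formula `y = (x - zero_point) * scale` in plain Python. The helper `dequantize_linear` is a hypothetical stand-in written for this illustration; it is not the MIGraphX operator or its API.

```python
def dequantize_linear(x, scale, zero_point=0):
    """Hypothetical helper: elementwise y = (x - zero_point) * scale."""
    return [(xi - zero_point) * scale for xi in x]


# Two unrelated quantized inputs (int8-range values chosen for illustration).
quantized_a = [-128, 0, 127]
quantized_b = [5, 17, -42]

# With scale == 0 every output element is zero, whatever the input was,
# so nothing downstream depends on the quantized data any more.
zero_a = dequantize_linear(quantized_a, scale=0.0)
zero_b = dequantize_linear(quantized_b, scale=0.0)

# With a non-zero scale the output depends on the input again.
nonzero_a = dequantize_linear(quantized_a, scale=0.5)
```

Because both zero-scale results compare equal to all zeros, an optimizer is free to treat the quantized input as unused, which matches the dead-code elimination described in the issue.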
Since MLIR supports this fusion, we need to update `fuse_mlir` to find such patterns and fuse them. For now, we should probably have a flag to enable the fusion until...
This PR updates CK commit hash after merging https://github.com/ROCm/composable_kernel/pull/2552
Gather the mxr/Python files from all the models, including the NCHW/NHWC, fp16/bf16/fp32, and rewrite_dot on/off variants. Measure the performance for each file with and without MLIR on all the different hardware. Make a list...
The upgraded toolchain produces a new compile warning that needs to be bypassed for the topk test to compile and run successfully. > [ RUN ] test_topk /tmp/comgr-d7a292/input/main.cpp:11:22:...
Low-priority enhancement to [Issue 1670](https://github.com/ROCm/AMDMIGraphX/issues/1670): the `make_gather_instruction()` function calls in `parse_resize.cpp` can be replaced by invoking the Resize op, as follows: /* TODO: to make Onnx resize ALWAYS parse to...
* Currently the `allocate_gpu`, `to_gpu`, `from_gpu` and `gpu_sync` functions are exposed through the python API.
* We should make it such that these functions are hidden behind the `target.copy_to()` and...
Add sequence operator support for ONNX ops. Initial work started for SplitToSequence in #1366. Need to support other sequence-type ops.