AMDMIGraphX
AMD's graph optimization engine.
disable const_folding for unpack_int4
- Implement a separate API that performs the graph construction for specific operators instead of it being done directly in the parser
- Resolves https://github.com/migraphx-benchmark/AMDMIGraphX/issues/190
### Problem Description
A GPU fault occurs when running the onnxruntime-inference-examples script with reduced-layer BERT models during benchmarking. The quantization and calibration steps appear to work; the fault arises during inference. ```...
For integer quantization, the quantizelinear operation is `nearbyint(x / scale) + zeropoint`. FP8 is already a floating-point format. Is nearbyint necessary in that case?
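To make the integer case concrete, here is a minimal sketch of the formula quoted above, written in plain Python rather than with any MIGraphX API. The function names, the int8 range defaults, and the final clamp are assumptions added for illustration; the issue text only states the `nearbyint(x / scale) + zeropoint` part.

```python
def quantize_linear(x, scale, zero_point, qmin=-128, qmax=127):
    """Quantize a single float to an integer (hypothetical helper).

    qmin/qmax default to the int8 range; that range is an assumption
    made for this sketch, not something stated in the issue.
    """
    # Python's round() rounds halves to the nearest even integer, which
    # matches nearbyint under the default round-to-nearest-even mode.
    q = round(x / scale) + zero_point
    # Saturate to the target integer range (assumed for this sketch).
    return max(qmin, min(qmax, q))


def dequantize_linear(q, scale, zero_point):
    """Map a quantized integer back to an approximate float."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    for value in (0.3, 2.5, 3.5, 1000.0):
        q = quantize_linear(value, scale=1.0, zero_point=0)
        print(value, "->", q, "->", dequantize_linear(q, 1.0, 0))
```

The rounding step is what maps a real-valued ratio onto the integer grid. Whether a comparable explicit step is needed for FP8 is the open question raised above; this sketch covers only the integer path.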
At the time of writing this issue, ONNX has FP8 support for the following ops. ONNX will keep adding FP8 support for more operators.
- castlike
- cast
- constant...