onnxruntime MIGraphX Execution Provider: Stream Synchronization

Description: Changes to the MIGraphx execution provider code to allow for stream synchronization on the gpu side

Motivation and Context Performance boost by removing redundant host to device synchronizations

The current implementation of the execution provider continuously calls hipDeviceSynchronize() between computations which adds overhead and an idle wait between the GPU's computations. This is noticeable during device

This change leverages new functionality that's been added to MIGraphX to allow for GPU side synchronization which avoids the need for host->device waits.

To maintain backwards compatibility with older MIGraphX versions, the compile time define MIGRAPHX_STREAM_SYNC has been added to the API to allow for older version operate with newer builds of onnxruntime without loss of functionality to the current feature set as of (08/09/22)

Sep 08 '22 16:09 TedThemistokleous

All CLA requirements met.

Sep 08 '22 16:09 ghost

Hi @snnn, the MIGraphX side has completed our review process for the changeset for this and I just pushed up a change the other day that shouldn't break functionality between older versions of MIGraphX and Onnxruntime with changes to the EP.

These stream sync changes shall be reflected in our 5.4 release and earlier MIGraphX versions should function as if this wasn't added for now.

Can I get another run of the above workflow CI?

Please let me know of any further feedback/questions/concerns on this change set as well.

Oct 07 '22 14:10 TedThemistokleous

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline

Oct 10 '22 20:10 ytaous

/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows WebAssembly CI Pipeline, orttraining-amd-gpu-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, onnxruntime-python-checks-ci-pipeline

Oct 10 '22 20:10 ytaous

Azure Pipelines successfully started running 8 pipeline(s).

Oct 10 '22 20:10 azure-pipelines[bot]

Azure Pipelines successfully started running 8 pipeline(s).

Oct 10 '22 20:10 azure-pipelines[bot]

/azp run Linux MIGraphX CI Pipeline

Oct 11 '22 08:10 PeixuanZuo

Azure Pipelines successfully started running 1 pipeline(s).

Oct 11 '22 08:10 azure-pipelines[bot]

2022-10-11T10:13:47.9337923Z /onnxruntime_src/onnxruntime/core/providers/migraphx/migraphx_execution_provider.cc:1064:31: error: ‘ctx’ was not declared in this scope 2022-10-11T10:13:47.9339811Z 1064 | auto input_tensor = ctx.GetInput(index);

Oct 11 '22 17:10 ytaous

/azp run Linux MIGraphX CI Pipeline

Oct 12 '22 00:10 ytaous

Azure Pipelines successfully started running 1 pipeline(s).

Oct 12 '22 00:10 azure-pipelines[bot]

Force pushed after a rebase off your main branch. Looks like I had a conflict with another upstream change that broke CI. Was able to reproduce and fix locally. Not sure if I need permission again for additional CI runs. I suppose we'll see in the morning.

Oct 12 '22 00:10 TedThemistokleous

/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows WebAssembly CI Pipeline, orttraining-amd-gpu-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, onnxruntime-python-checks-ci-pipeline

Oct 12 '22 01:10 ytaous

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline

Oct 12 '22 01:10 ytaous

Azure Pipelines successfully started running 8 pipeline(s).

Oct 12 '22 01:10 azure-pipelines[bot]

Azure Pipelines successfully started running 8 pipeline(s).

Oct 12 '22 01:10 azure-pipelines[bot]

This pull request fixes 1 alert when merging ce6de4f9cd60245dd6cb59b69f405ec8083938ce into b2353fa737e3aa02a7fe99f8c52112443abf7c09 - view on LGTM.com

fixed alerts:

1 for Potentially uninitialized local variable

Oct 12 '22 02:10 lgtm-com[bot]

onnxruntime onnxruntime copied to clipboard

MIGraphX Execution Provider: Stream Synchronization

onnxruntime
onnxruntime copied to clipboard