[QNN EP] Dump UT artifacts for debugging

yuhuchua-qti opened this issue 1 month ago

Description

This change introduces new environment variables to make debugging failed unit tests easier by saving input/output artifacts:

  • ORT_UNIT_TEST_DUMP_ARTIFACTS: Dumps inputs, outputs, and expected outputs for test cases.
  • ORT_UNIT_TEST_ARTIFACTS_DIR: Sets the directory where artifacts are stored. Defaults to the current working directory if not provided.
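
For illustration, a minimal sketch of how these variables could be read on the test-utility side (the helper names here are placeholders, not necessarily what the PR uses):

    // Sketch only: how ORT_UNIT_TEST_DUMP_ARTIFACTS and ORT_UNIT_TEST_ARTIFACTS_DIR
    // could be consumed in test_utils.cc. Helper names are hypothetical.
    #include <cstdlib>
    #include <filesystem>
    #include <string>

    static bool ShouldDumpArtifacts() {
      const char* flag = std::getenv("ORT_UNIT_TEST_DUMP_ARTIFACTS");
      return flag != nullptr && std::string(flag) == "1";
    }

    static std::filesystem::path ArtifactsBaseDir() {
      const char* dir = std::getenv("ORT_UNIT_TEST_ARTIFACTS_DIR");
      // Fall back to the current working directory when the variable is not set.
      return dir != nullptr ? std::filesystem::path(dir)
                            : std::filesystem::current_path();
    }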

Examples

Example 1: Using ORT_UNIT_TEST_DUMP_ARTIFACTS

PowerShell:

    $Env:ORT_UNIT_TEST_DUMP_ARTIFACTS = "1"

The resulting artifact directory layout:

C:.
├───QnnCPUBackendTests_DISABLED_Gemm_Broadcast_Bias_DynamicA_StaticB_DynamicC  # TestModelAccuracy 
│   ├───CPU_EP_outputs
│   │       output.pb
│   │
│   ├───inputs
│   │       input.pb
│   │       input_token_0.pb
│   │
│   └───QNNExecutionProvider_outputs
│           output.pb
│
└───QnnHTPBackendTests_DISABLED_BinaryOp_Div4D_LargeInputs  # TestQDQModelAccuracy 
    ├───CPU_EP_F32_outputs
    │       output.pb
    │
    ├───CPU_EP_QDQ_outputs
    │       output.pb
    │
    ├───inputs
    │       input.pb
    │       input_token_5.pb
    │
    └───QNN_EP_QDQ_outputs
            output.pb
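
The .pb files above are presumably serialized onnx::TensorProto messages (an assumption based on the listing, not confirmed in this thread); if so, a single artifact can be inspected with a small standalone program like this sketch:

    // Sketch only: parse one dumped artifact, assuming it is a serialized
    // onnx::TensorProto. Adjust the include path and file name to your setup.
    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>

    #include "onnx/onnx_pb.h"

    int main() {
      std::ifstream in("inputs/input.pb", std::ios::binary);
      std::stringstream buffer;
      buffer << in.rdbuf();

      onnx::TensorProto tensor;
      if (!tensor.ParseFromString(buffer.str())) {
        std::cerr << "Failed to parse tensor proto" << std::endl;
        return 1;
      }
      std::cout << "name: " << tensor.name()
                << ", dims: " << tensor.dims_size() << std::endl;
      return 0;
    }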

yuhuchua-qti · Nov 26 '25

I believe something similar was implemented in https://github.com/microsoft/onnxruntime/pull/26396. If additional functionality is needed, can the infrastructure from that PR be extended?

edgchen1 · Dec 01 '25

Hi @edgchen1,

The functionality introduced in this PR is not specific to QNN EP. It adds the ability to dump input and output data (inputs, actual outputs, and expected outputs) from the common accuracy tests in test_utils.cc, which can benefit all ORT execution providers when debugging accuracy issues.

PR https://github.com/microsoft/onnxruntime/pull/26396, which is already merged, only updated QNN-EP-specific logic to dump QNN-specific material such as DLC files. If we added this functionality by modifying only QNN-EP-specific code, we would likely need to re-implement the RunAndVerifyOutputsWithEP() function inside QNN EP, duplicating existing logic. For this reason, we chose to add the dumping functionality to the common accuracy test framework, so that other ORT providers can benefit as well. Thanks.

yuhuchua-qti · Dec 03 '25

I see, I missed the update to onnxruntime/test/util/test_utils.cc.

Are you aware of the debug node I/O dumping infrastructure? I'm wondering whether that would work, or whether it is worth having an additional input/output dumping mechanism.

edgchen1 · Dec 03 '25

Hi @edgchen1 ,

I was not aware of the existing I/O dumping infrastructure earlier, but I have now tried this functionality and confirmed that it successfully dumps input and output data (including input, output, and expected output) from accuracy tests. To better leverage this infrastructure in the current ORT, I have one discussion point:

Update the naming logic in debug_node_inputs_outputs_utils.cc: in accuracy unit tests, the UT model may be executed two or three times with different EPs (e.g., ORT CPU and QNN HTP). Currently, the naming of dumped tensor files is identical whenever tensor names match across different inference sessions, so later runs overwrite earlier files. To avoid this, we propose updating the naming logic in the MakeTensorFileName() function in debug_node_inputs_outputs_utils.cc so that tensors from different EP runs are saved to distinct files. Would this change be acceptable to you?
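
For illustration, one possible shape of such a change (the run_tag parameter and the file extension are assumptions for this sketch; the real MakeTensorFileName() signature may differ):

    // Sketch only: add a per-run tag (e.g., the EP name) to the dumped file
    // name so that runs from different EPs no longer overwrite each other.
    // The real MakeTensorFileName() signature may differ.
    #include <filesystem>
    #include <string>

    std::filesystem::path MakeTensorFileNameSketch(const std::string& tensor_name,
                                                   const std::string& run_tag) {
      // e.g., "input" dumped by the CPU run becomes "input_CPUExecutionProvider.tensorproto"
      // and by the QNN run becomes "input_QNNExecutionProvider.tensorproto".
      return std::filesystem::path(tensor_name + "_" + run_tag + ".tensorproto");
    }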

Thanks.

yuhuchua-qti · Dec 10 '25

Glad to know it may be reusable. Sure, updating the file naming logic sounds reasonable to me.

edgchen1 · Dec 11 '25

Hi @edgchen1, I’ve updated the PR based on our previous discussion. Could you please review it again when you have a chance? Thank you!

yuhuchua-qti · Dec 22 '25