
[mlas] Speed up tanhf activation function

Open r-devulap opened this issue 9 months ago • 30 comments

Description

A new, faster algorithm for the tanhf activation function using Intel SVML.

Motivation and Context

Improves tanhf performance by nearly 40%. The new algorithm also fixes a bug in the current tanhf implementation, whose output can fall outside the bounds [-1, 1]. Example: for x = +0x1.06417ep+003, tanhf returns +0x1.000002p+000, which is greater than 1.

Benchmark                                                 Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------------------------
[BM_Tanh vs. BM_Tanh]/40000/real_time                  -0.3822         -0.3825         15059          9304         15035          9283
[BM_Tanh vs. BM_Tanh]/80000/real_time                  -0.3845         -0.3844         30055         18499         29998         18467
[BM_Tanh vs. BM_Tanh]/160000/real_time                 -0.3146         -0.3144         17803         12203         17762         12178
[BM_Tanh vs. BM_Tanh]/320000/real_time                 -0.3495         -0.3491         32840         21362         32724         21300
[BM_Tanh vs. BM_Tanh]/640000/real_time                 -0.3563         -0.3568         62902         40487         62754         40361
[BM_Tanh vs. BM_Tanh]/1280000/real_time                -0.3326         -0.3333        128536         85780        128102         85408
OVERALL_GEOMEAN                                        -0.3538         -0.3539             0             0             0             0

r-devulap avatar May 08 '24 20:05 r-devulap

@r-devulap please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contribution License Agreement

This Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”), and conveys certain license rights to Microsoft Corporation and its affiliates (“Microsoft”) for Your contributions to Microsoft open source projects. This Agreement is effective as of the latest signature date below.

  1. Definitions. “Code” means the computer software code, whether in human-readable or machine-executable form, that is delivered by You to Microsoft under this Agreement. “Project” means any of the projects owned or managed by Microsoft and offered under a license approved by the Open Source Initiative (www.opensource.org). “Submit” is the act of uploading, submitting, transmitting, or distributing code or other content to any Project, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Project for the purpose of discussing and improving that Project, but excluding communication that is conspicuously marked or otherwise designated in writing by You as “Not a Submission.” “Submission” means the Code and any other copyrightable material Submitted by You, including any associated comments and documentation.
  2. Your Submission. You must agree to the terms of this Agreement before making a Submission to any Project. This Agreement covers any and all Submissions that You, now or in the future (except as described in Section 4 below), Submit to any Project.
  3. Originality of Work. You represent that each of Your Submissions is entirely Your original work. Should You wish to Submit materials that are not Your original work, You may Submit them separately to the Project if You (a) retain all copyright and license information that was in the materials as You received them, (b) in the description accompanying Your Submission, include the phrase “Submission containing materials of a third party:” followed by the names of the third party and any licenses or other restrictions of which You are aware, and (c) follow any other instructions in the Project’s written guidelines concerning Submissions.
  4. Your Employer. References to “employer” in this Agreement include Your employer or anyone else for whom You are acting in making Your Submission, e.g. as a contractor, vendor, or agent. If Your Submission is made in the course of Your work for an employer or Your employer has intellectual property rights in Your Submission by contract or applicable law, You must secure permission from Your employer to make the Submission before signing this Agreement. In that case, the term “You” in this Agreement will refer to You and the employer collectively. If You change employers in the future and desire to Submit additional Submissions for the new employer, then You agree to sign a new Agreement and secure permission from the new employer before Submitting those Submissions.
  5. Licenses.
  • Copyright License. You grant Microsoft, and those who receive the Submission directly or indirectly from Microsoft, a perpetual, worldwide, non-exclusive, royalty-free, irrevocable license in the Submission to reproduce, prepare derivative works of, publicly display, publicly perform, and distribute the Submission and such derivative works, and to sublicense any or all of the foregoing rights to third parties.
  • Patent License. You grant Microsoft, and those who receive the Submission directly or indirectly from Microsoft, a perpetual, worldwide, non-exclusive, royalty-free, irrevocable license under Your patent claims that are necessarily infringed by the Submission or the combination of the Submission with the Project to which it was Submitted to make, have made, use, offer to sell, sell and import or otherwise dispose of the Submission alone or with the Project.
  • Other Rights Reserved. Each party reserves all rights not expressly granted in this Agreement. No additional licenses or rights whatsoever (including, without limitation, any implied licenses) are granted by implication, exhaustion, estoppel or otherwise.
  6. Representations and Warranties. You represent that You are legally entitled to grant the above licenses. You represent that each of Your Submissions is entirely Your original work (except as You may have disclosed under Section 3). You represent that You have secured permission from Your employer to make the Submission in cases where Your Submission is made in the course of Your work for Your employer or Your employer has intellectual property rights in Your Submission by contract or applicable law. If You are signing this Agreement on behalf of Your employer, You represent and warrant that You have the necessary authority to bind the listed employer to the obligations contained in this Agreement. You are not expected to provide support for Your Submission, unless You choose to do so. UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING, AND EXCEPT FOR THE WARRANTIES EXPRESSLY STATED IN SECTIONS 3, 4, AND 6, THE SUBMISSION PROVIDED UNDER THIS AGREEMENT IS PROVIDED WITHOUT WARRANTY OF ANY KIND, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTY OF NONINFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE.
  7. Notice to Microsoft. You agree to notify Microsoft in writing of any facts or circumstances of which You later become aware that would make Your representations in this Agreement inaccurate in any respect.
  8. Information about Submissions. You agree that contributions to Projects and information about contributions may be maintained indefinitely and disclosed publicly, including Your name and other information that You submit with Your Submission.
  9. Governing Law/Jurisdiction. This Agreement is governed by the laws of the State of Washington, and the parties consent to exclusive jurisdiction and venue in the federal courts sitting in King County, Washington, unless no federal subject matter jurisdiction exists, in which case the parties consent to exclusive jurisdiction and venue in the Superior Court of King County, Washington. The parties waive all defenses of lack of personal jurisdiction and forum non-conveniens.
  10. Entire Agreement/Assignment. This Agreement is the entire agreement between the parties, and supersedes any and all prior agreements, understandings or communications, written or oral, between the parties relating to the subject matter hereof. This Agreement may be assigned by Microsoft.

/azp run ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux GPU CI Pipeline,orttraining-amd-gpu-ci-pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline

yufenglee avatar May 08 '24 21:05 yufenglee

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline

yufenglee avatar May 08 '24 21:05 yufenglee

Azure Pipelines successfully started running 7 pipeline(s).

azure-pipelines[bot] avatar May 08 '24 21:05 azure-pipelines[bot]

Azure Pipelines successfully started running 10 pipeline(s).

azure-pipelines[bot] avatar May 08 '24 21:05 azure-pipelines[bot]

Please add a benchmark for the tanh activation function in onnxruntime/test/mlas/bench/.

There is already a benchmark for tanhf, BM_Tanh. Is this not sufficient? https://github.com/microsoft/onnxruntime/blob/69cfcba38a60d65498f94cde30cb9c2030f7255b/onnxruntime/test/onnx/microbenchmark/activation.cc#L342-L344

Once you've done that, make sure to record the performance numbers both with and without your patch in the commit message.

The performance numbers of BM_Tanh before and after have already been included in the commit message: See https://github.com/microsoft/onnxruntime/pull/20612/commits/c6c93092f333650f126d0a83bce3340a4a179eb4

r-devulap avatar May 09 '24 16:05 r-devulap

/azp run ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux GPU CI Pipeline,orttraining-amd-gpu-ci-pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline

yufenglee avatar May 11 '24 02:05 yufenglee

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline

yufenglee avatar May 11 '24 02:05 yufenglee

Azure Pipelines successfully started running 7 pipeline(s).

azure-pipelines[bot] avatar May 11 '24 02:05 azure-pipelines[bot]

Azure Pipelines successfully started running 10 pipeline(s).

azure-pipelines[bot] avatar May 11 '24 02:05 azure-pipelines[bot]

/azp run ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux GPU CI Pipeline,orttraining-amd-gpu-ci-pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline

yufenglee avatar May 13 '24 23:05 yufenglee

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline

yufenglee avatar May 13 '24 23:05 yufenglee

Azure Pipelines successfully started running 7 pipeline(s).

azure-pipelines[bot] avatar May 13 '24 23:05 azure-pipelines[bot]

Azure Pipelines successfully started running 10 pipeline(s).

azure-pipelines[bot] avatar May 13 '24 23:05 azure-pipelines[bot]

/azp run Linux Android Emulator QNN CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, Windows CPU CI Pipeline , Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows x64 QNN CI Pipeline

yufenglee avatar May 17 '24 02:05 yufenglee

Azure Pipelines successfully started running 8 pipeline(s).

azure-pipelines[bot] avatar May 17 '24 02:05 azure-pipelines[bot]

You need to sign the license/CLA agreement to move on.

yufenglee avatar May 17 '24 02:05 yufenglee

You need to sign the license/CLA agreement to move on.

Yup, working on it.

r-devulap avatar May 17 '24 17:05 r-devulap

/azp run ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux GPU CI Pipeline,orttraining-amd-gpu-ci-pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline

yufenglee avatar May 17 '24 18:05 yufenglee

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline

yufenglee avatar May 17 '24 18:05 yufenglee

Azure Pipelines successfully started running 7 pipeline(s).

azure-pipelines[bot] avatar May 17 '24 18:05 azure-pipelines[bot]

Azure Pipelines successfully started running 10 pipeline(s).

azure-pipelines[bot] avatar May 17 '24 18:05 azure-pipelines[bot]

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline

yufenglee avatar May 17 '24 19:05 yufenglee

Azure Pipelines successfully started running 10 pipeline(s).

azure-pipelines[bot] avatar May 17 '24 19:05 azure-pipelines[bot]

/azp run Linux Android Emulator QNN CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, Windows CPU CI Pipeline , Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows x64 QNN CI Pipeline

yufenglee avatar May 17 '24 20:05 yufenglee

Azure Pipelines successfully started running 8 pipeline(s).

azure-pipelines[bot] avatar May 17 '24 20:05 azure-pipelines[bot]

@yufenglee Couple of questions I need help with:

  1. I'm unable to replicate the Windows CI pipeline failure locally (fail log here). The test LSTMTest.BackwardCompute passes for me. Any pointers on why it's behaving differently?

  2. The other failure in Windows GPU CI Pipeline (see log) fails a test onnxruntime_test_all -> ModelTests/ModelTest.Run/fp16_coreml_FNS_Candy_opset7_CPU. But my local build doesn't contain the ModelTests* set at all. How do I build these tests?

r-devulap avatar May 20 '24 19:05 r-devulap

@yufenglee Couple of questions I need help with:

  1. I'm unable to replicate the Windows CI pipeline failure locally (fail log here). The test LSTMTest.BackwardCompute passes for me. Any pointers on why it's behaving differently?
  2. The other failure in Windows GPU CI Pipeline (see log) fails a test onnxruntime_test_all -> ModelTests/ModelTest.Run/fp16_coreml_FNS_Candy_opset7_CPU. But my local build doesn't contain the ModelTests* set at all. How do I build these tests?

The 1st failure is with DML EP on. Did you build with dml ep enabled? For the 2nd issue, @snnn , is it possible that we can share the model?

yufenglee avatar May 20 '24 22:05 yufenglee

The 1st failure is with DML EP on. Did you build with dml ep enabled?

Nope, let me try building with DirectML.

For the 2nd issue, @snnn , is it possible that we can share the model?

You probably don't need to, if you can help figure out why this commit https://github.com/microsoft/onnxruntime/pull/20612/commits/5329f70d3b71f9261add1b1307f6c99364b8b624 isn't being applied to the model ModelTests/ModelTest.Run/fp16_coreml_FNS_Candy_opset7_CPU.

r-devulap avatar May 21 '24 15:05 r-devulap

For the 2nd issue, @snnn , is it possible that we can share the model?

Sorry, we cannot. We may consider removing these models from our build pipelines if there is no better way to handle them.

snnn avatar May 21 '24 16:05 snnn