
Batchnorm Fused Inference OpenCL to HIP

Open sgundabo opened this issue 1 year ago • 1 comments

This PR focuses on converting the Batch Norm Fused Inference kernel from OpenCL to HIP. This conversion is a part of the broader initiative to translate all OpenCL kernels within MIOpen, as the OpenCL backend has been deprecated.

Ensuring correctness: The PR includes a GTest that compares the output of the OpenCL kernel with the output of the HIP implementation. The test cases are derived from the existing batch norm forward inference kernel GTest. Enabling the performance test flag should additionally run the kernel over a wider range of tensor sizes, which are calculated based on the architecture.
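For illustration only, the kind of element-wise check described above can be sketched on the host. This is not the GTest from the PR: the function names `BatchNormInferReluRef` and `MaxAbsDiff`, the NCHW layout, the per-channel (spatial) statistics, and the choice of ReLU as the fused activation are all assumptions made for the example. The batch norm inference formula itself is the standard `y = scale * (x - mean) / sqrt(variance + epsilon) + bias`.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical host reference: batch norm inference followed by ReLU.
// Assumes NCHW layout with one scale/bias/mean/variance value per channel.
std::vector<float> BatchNormInferReluRef(const std::vector<float>& x,
                                         std::size_t n,
                                         std::size_t c,
                                         std::size_t hw,
                                         const std::vector<float>& scale,
                                         const std::vector<float>& bias,
                                         const std::vector<float>& mean,
                                         const std::vector<float>& variance,
                                         float epsilon)
{
    std::vector<float> y(x.size());
    for(std::size_t in = 0; in < n; ++in)
    {
        for(std::size_t ic = 0; ic < c; ++ic)
        {
            // Inverse standard deviation is constant for the whole channel.
            const float inv_std = 1.0f / std::sqrt(variance[ic] + epsilon);
            for(std::size_t i = 0; i < hw; ++i)
            {
                const std::size_t idx = (in * c + ic) * hw + i;
                const float bn = scale[ic] * (x[idx] - mean[ic]) * inv_std + bias[ic];
                y[idx]         = std::max(bn, 0.0f); // fused ReLU (assumed activation)
            }
        }
    }
    return y;
}

// Largest absolute element-wise difference between two output buffers.
// Returns infinity when the sizes differ so that a size mismatch never passes.
float MaxAbsDiff(const std::vector<float>& a, const std::vector<float>& b)
{
    if(a.size() != b.size())
        return std::numeric_limits<float>::infinity();
    float worst = 0.0f;
    for(std::size_t i = 0; i < a.size(); ++i)
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    return worst;
}
```

In a test, the two buffers passed to `MaxAbsDiff` would be the outputs copied back from the two kernels, and the result would be compared against a tolerance chosen per data type.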

Ensuring GPU Performance parity: The GTEST also measures the minimum, maximum, mean, median, and standard deviation of the kernel execution time across five runs and records the data in a CSV file. This data is used to create graphs that illustrate the average performance improvement of the HIP implementation over OpenCL. An average performance gain greater than one is considered favorable. The OpenCL FP16 kernel is broken, hence only the correctness of the FP16 implementation is verified.
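A minimal sketch of the statistics described above, for readers who want to reproduce the numbers from the raw timings. The names `TimingStats`, `Summarize`, `Speedup`, and `ToCsvRow` are hypothetical and do not come from the PR. The sketch also assumes that the "performance gain" is the ratio of mean OpenCL time to mean HIP time, that the standard deviation is the population form, and that the CSV column order is min, max, mean, median, stddev; the actual test may differ on all three.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <sstream>
#include <string>
#include <vector>

// Summary of kernel execution times (for example, five runs in milliseconds).
struct TimingStats
{
    double min;
    double max;
    double mean;
    double median;
    double stddev;
};

// Computes the summary; the input is taken by value because it is sorted.
// Precondition: times is not empty.
TimingStats Summarize(std::vector<double> times)
{
    std::sort(times.begin(), times.end());
    const std::size_t n = times.size();
    const double mean   = std::accumulate(times.begin(), times.end(), 0.0) / n;
    const double median =
        (n % 2 == 1) ? times[n / 2] : 0.5 * (times[n / 2 - 1] + times[n / 2]);
    double sq_sum = 0.0;
    for(double t : times)
        sq_sum += (t - mean) * (t - mean);
    const double stddev = std::sqrt(sq_sum / n); // population standard deviation
    return {times.front(), times.back(), mean, median, stddev};
}

// Assumed definition of the gain: values above 1.0 mean HIP is faster on average.
double Speedup(const TimingStats& ocl, const TimingStats& hip)
{
    return ocl.mean / hip.mean;
}

// One CSV row: label,min,max,mean,median,stddev
std::string ToCsvRow(const std::string& label, const TimingStats& s)
{
    std::ostringstream row;
    row << label << ',' << s.min << ',' << s.max << ',' << s.mean << ',' << s.median << ','
        << s.stddev;
    return row.str();
}
```

With this definition, a HIP kernel that averages 1.5 ms against an OpenCL kernel that averages 3 ms yields a gain of 2.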

TODO: Collect perf metrics on gfx90a, and ensure parity.

Ensuring host-side performance parity: Since OpenCL backend support is deprecated in MIOpen, the assumption is that the deprecation decision was made with awareness of the compilation overhead of HIP kernels compared to OpenCL.

sgundabo, Jul 24 '24 22:07

Perf raw data (gfx90a): BatchNormInferFusedInfo.zip

Perf FP32: BatchNormInferFusedExtraInfo_FP32 (image)

sgundabo, Jul 30 '24 17:07

It can be safely merged since it does not affect production code. I just don't want to lose this PR.

CAHEK7, Sep 12 '24 15:09

This PR is also important because it provides a centralized definition for a set of activation functions. For example, #3247 currently carries its own local definition of sigmoid: https://github.com/ROCm/MIOpen/pull/3247/files#diff-2a117e014b2a1c04feb3ede9723a78a8d11d656b4e9631fa57ab1d7c58df55d6 @junliume

CAHEK7, Sep 13 '24 14:09