Dropout kernel OpenCL to HIP + gtest

Open sgundabo opened this issue 1 year ago • 6 comments

Initial attempt at translating the Dropout OpenCL kernel to HIP, together with a gtest. The hardcoded PRNG matrices are replaced with rocRAND function calls.
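For readers unfamiliar with the operation, here is a minimal host-side sketch of the kind of reference a gtest could compare a dropout kernel against. It is not the code from this PR: `dropout_forward_ref` and `DropoutResult` are hypothetical names, `std::mt19937` stands in for the rocRAND generator, and the 1/(1 - p) scaling of kept elements (inverted dropout) is an assumption about the intended semantics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical host-side reference for inverted dropout.
// NOT the PR's HIP kernel: std::mt19937 is a stand-in for rocRAND.
struct DropoutResult
{
    std::vector<float> output;
    std::vector<std::uint8_t> mask; // 1 = element kept, 0 = element dropped
};

DropoutResult dropout_forward_ref(const std::vector<float>& input,
                                  float drop_prob,
                                  std::uint32_t seed)
{
    // drop_prob == 1 would make the scale factor infinite.
    assert(drop_prob >= 0.0f && drop_prob < 1.0f);

    DropoutResult r;
    r.output.resize(input.size());
    r.mask.resize(input.size());

    std::mt19937 gen(seed);
    std::uniform_real_distribution<float> uniform(0.0f, 1.0f);

    // Kept elements are scaled up so the expected value of the output
    // matches the input (assumed inverted-dropout convention).
    const float scale = 1.0f / (1.0f - drop_prob);

    for(std::size_t i = 0; i < input.size(); ++i)
    {
        const bool keep = uniform(gen) >= drop_prob;
        r.mask[i]       = keep ? 1 : 0;
        r.output[i]     = keep ? input[i] * scale : 0.0f;
    }
    return r;
}
```

Because the mask is returned alongside the output, a test can check each output element against its mask bit without needing the host and device generators to produce the same random stream.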

sgundabo avatar Jun 28 '24 14:06 sgundabo

@CAHEK7 @amberhassaan

Please find the profiling results attached below.

@sgundabo Just for reference, what kind of GPU did you use to get those results? And I guess we need some bigger tensors.

CAHEK7 avatar Jul 17 '24 12:07 CAHEK7

These results are from a gfx90a.

sgundabo avatar Jul 17 '24 15:07 sgundabo

Raw perf data with detailed kernel information: DropoutPerf_FP16.zip, DropoutPerf_FP32.zip

HW tested: gfx90a

FP32 Perf DropoutPerfRaw_FP32

FP16 Perf DropoutPerfRaw_FP16

sgundabo avatar Jul 31 '24 01:07 sgundabo

RawPerf Data DropoutPerf_smalltensors.zip

Perf FP16 DropoutPerf_smalltensors_FP16

Perf FP32 DropoutPerf_smalltensors_FP32

sgundabo avatar Aug 02 '24 00:08 sgundabo

FP32 Perf Analysis

Boxplot dropout0.50 mask0 Boxplot_min_exec_time_ratio

Boxplot all dropouts and mask combinations Boxplot_all_combinations

sgundabo avatar Aug 02 '24 00:08 sgundabo

@junliume could you merge this one? A few other PRs depend on it.

CAHEK7 avatar Aug 05 '24 17:08 CAHEK7

@sgundabo @CAHEK7 please feel free to ping me on any other channels. Also marking the urgency level of the PR would help for visibility. Thanks!

junliume avatar Aug 09 '24 06:08 junliume