Dropout kernel OpenCL to HIP + gtest
Initial attempt at translating the Dropout OpenCL kernel to HIP, together with a gtest. The hardcoded PRNG matrices are replaced with rocrand function calls.
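For context, a gtest for a kernel like this usually compares the device output against a host reference. Below is a minimal sketch of such a reference, not the code in this PR. `DropoutForwardRef` and `DropoutResult` are hypothetical names, `std::mt19937` is only a stand-in for the rocrand generator, and the 1/(1-p) scaling of kept elements is an assumption about the dropout variant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical host-side reference for forward dropout (not the PR's code).
struct DropoutResult
{
    std::vector<float> output;
    std::vector<std::uint8_t> mask; // 1 = element kept, 0 = element dropped
};

// Assumption: "inverted" dropout, where kept elements are scaled by 1/(1-p).
// std::mt19937 stands in for the rocrand generator used on the device, so the
// mask produced here will NOT match a rocrand-generated mask bit for bit.
inline DropoutResult DropoutForwardRef(const std::vector<float>& input,
                                       float dropout_prob,
                                       std::uint32_t seed)
{
    assert(dropout_prob >= 0.0f && dropout_prob <= 1.0f);

    DropoutResult r;
    r.output.resize(input.size());
    r.mask.resize(input.size());

    std::mt19937 gen(seed);
    std::uniform_real_distribution<float> uniform(0.0f, 1.0f);

    // Avoid dividing by zero when every element is dropped (p == 1).
    const float scale = dropout_prob < 1.0f ? 1.0f / (1.0f - dropout_prob) : 0.0f;

    for(std::size_t i = 0; i < input.size(); ++i)
    {
        // Keep the element when the uniform sample is at or above p.
        const bool keep = uniform(gen) >= dropout_prob;
        r.mask[i]       = keep ? 1 : 0;
        r.output[i]     = keep ? input[i] * scale : 0.0f;
    }
    return r;
}
```

Because the host and device generators differ, a test built on this sketch would feed the mask produced by the kernel back into the comparison (or check only statistical properties such as the fraction of dropped elements) rather than expect identical masks.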
@CAHEK7 @amberhassaan
Please find the profiling results attached below.
@sgundabo Just for reference, what kind of GPU did you use to get those results? Also, I guess we need some bigger tensors.
These results are from a gfx90a.
PerfData
DropoutPerf_large_5.csv DropoutPerf_large_4.csv DropoutPerf_large_3.csv DropoutPerf_large_2.csv DropoutPerf_large_1.csv
HW tested: gfx90a
FP32 Perf
FP16 Perf
Raw Perf data with detailed kernel information DropoutPerf_FP16.zip DropoutPerf_FP32.zip
FP32 Perf Analysis
Boxplot dropout0.50 mask0
Boxplot all dropouts and mask combinations
@junliume could you merge this one? A few other PRs depend on it.
@sgundabo @CAHEK7 please feel free to ping me on any other channels. Also, marking the urgency level of the PR would help with visibility. Thanks!