rocRAND

Update device-specific config headers with benchmark tuning results

Open umfranzw opened this issue 1 year ago • 1 comments

I ran the benchmark tuning code and scripts on the remaining architectures in our GPU support matrix. This change updates the config headers with the results.

umfranzw avatar Apr 02 '24 17:04 umfranzw

LGTM, but waiting for the PTS pipeline to run to confirm performance changes first

stanleytsang-amd avatar Apr 02 '24 22:04 stanleytsang-amd

@umfranzw Can you address the merge conflicts for this PR? Thanks

stanleytsang-amd avatar Apr 17 '24 17:04 stanleytsang-amd

@stanleytsang-amd - I've fixed up the merge conflicts. Sorry for the delay on this. In conflict cases where there were two different values for the same architecture, I've preferred StreamHPC's updated numbers over mine, since their benchmarks were run more recently.

umfranzw avatar Apr 19 '24 20:04 umfranzw

The PTS reports indicate a sobol32 regression of slightly less than -5% here for the uniform_double distribution. sobol32 is not actually part of the config tuning process yet, so this change should (in theory) not affect its performance. Running the benchmark manually with a larger number of trials (10000) on gfx90a seems to reduce the magnitude of the sobol32 regression (to about -4.5%). @RobsonRLemos, just wanted to confirm that this is acceptable. If so, I think we can merge this PR.
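
For readers unfamiliar with how a regression percentage like this is usually read, here is a minimal sketch. It assumes the figure is the relative change in median throughput between a baseline build and the candidate build, with negative values meaning the candidate is slower. The function name and the sample numbers are hypothetical placeholders, not values from the PTS reports or the rocRAND benchmark scripts.

```python
from statistics import median


def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate vs. baseline throughput, in percent.

    Negative results mean the candidate is slower than the baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return (candidate - baseline) / baseline * 100.0


# Hypothetical throughput samples in GB/s (placeholders, not PTS data).
baseline_runs = [200.0, 202.0, 198.0]
candidate_runs = [190.0, 191.0, 189.0]

# Using the median of several trials damps run-to-run noise, which is one
# reason a larger trial count can shift the reported percentage.
change = percent_change(median(baseline_runs), median(candidate_runs))
print(f"{change:+.1f}%")
```

Taking the median over many trials is only one possible aggregation; the actual PTS pipeline may use a different statistic.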

umfranzw avatar Apr 22 '24 19:04 umfranzw

Added missing newlines at the ends of two of the config files.

umfranzw avatar Apr 22 '24 19:04 umfranzw