Update device-specific config headers with benchmark tuning results
I ran the benchmark tuning code and scripts on the remaining architectures in our GPU support matrix. This change updates the config headers with the results.
LGTM, but I'm waiting for the PTS pipeline to run so we can confirm the performance changes first.
@umfranzw Can you address the merge conflicts for this PR? Thanks
@stanleytsang-amd - I've fixed up the merge conflicts. Sorry for the delay on this. In conflict cases where there were two different values for the same architecture, I've preferred StreamHPC's updated numbers over mine, since their benchmarks were run more recently.
The PTS reports indicate a sobol32 regression of slightly less than -5% here for the uniform_double distribution. sobol32 is not part of the config tuning process yet, so in theory this change should not affect its performance. Running the benchmark manually with a larger number of trials (10000) on gfx90a appears to reduce the magnitude of the sobol32 regression (to about -4.5%). @RobsonRLemos, I just wanted to confirm that this is acceptable. If so, I think we can merge this PR.
Added missing newlines at the ends of two of the config files.