MIOpen
Do we need many members in PerformanceConfigAsmImplicitGemmGTC?
Originated from https://github.com/ROCmSoftwarePlatform/MIOpen/pull/1230#discussion_r737878197 (see the whole thread). Synopsis:
As far as I see, for PerformanceConfigAsmImplicitGemmGTCFwdXdlopsNHWC we only need to store the table index and gemm_k_global_split. The rest of the data can be read from the table right in GetSolution(). Please look into SetNextValue() and you'll see that only index and gemm_k_global_split are modified.
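A minimal sketch of the idea, not the actual MIOpen code: KernelParams, GetKernelParamTable, SlimPerfConfig and Params() are hypothetical names, the table values are placeholders, and the stepping order in SetNextValue() is a simplification assumed for illustration. It only shows that a config holding index and gemm_k_global_split can recover everything else from the table on demand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily reduced stand-in for one row of the generated table.
// The real rows carry many more tuning fields.
struct KernelParams
{
    std::string direction;
    std::string layout;
    int gemm_m_per_block;
    int gemm_n_per_block;
    int gemm_k_per_block;
};

// Hypothetical table; the values are placeholders, not real tuning data.
inline const std::vector<KernelParams>& GetKernelParamTable()
{
    static const std::vector<KernelParams> table{
        {"fwd", "nhwc", 256, 128, 16},
        {"fwd", "nhwc", 128, 128, 32},
    };
    return table;
}

// Slim performance config: only the state that the search actually mutates.
struct SlimPerfConfig
{
    std::size_t index       = 0;
    int gemm_k_global_split = 0;

    // Simplified stepping (an assumption): toggle the split first,
    // then advance the index. Returns false once the table is exhausted.
    bool SetNextValue()
    {
        if(gemm_k_global_split == 0)
        {
            gemm_k_global_split = 1;
            return true;
        }
        gemm_k_global_split = 0;
        ++index;
        return index < GetKernelParamTable().size();
    }

    // Everything else is read from the table on demand,
    // which is what GetSolution() would do in this design.
    const KernelParams& Params() const
    {
        assert(index < GetKernelParamTable().size());
        return GetKernelParamTable()[index];
    }
};
```

With this layout the per-config footprint is two scalars, and the table remains the single source of truth for the remaining parameters.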
Let's discuss.
Alternatively, you can use a type other than PerformanceConfigAsmImplicitGemmGTCFwdXdlopsNHWC for the vector returned by GetFwdXdlopsNHWCConfigList. For example, you can use an aggregate like this (pseudo code):
    struct GeneratedData
    {
        // Member order matches the initializers below.
        Direction direction;
        Layout layout;
        Datatype datatype;
        PerformanceConfigAsmImplicitGemmGTCFwdXdlopsNHWC perfConfig;
    };
    ...
    static inline const std::vector<GeneratedData>&
    GetFwdXdlopsNHWCConfigList()
    {
        static const std::vector<GeneratedData> kernel_param_list {
            {"fwd", "nhwc", miopenFloat, { ... } }, // a pair of curly braces inserted
            ...
        };
        return kernel_param_list;
    }
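For reference, a self-contained sketch of the aggregate approach that compiles on its own. All names here are hypothetical stand-ins: DataType replaces miopenDataType_t, PerfConfig replaces PerformanceConfigAsmImplicitGemmGTCFwdXdlopsNHWC with only three placeholder fields, and GetConfigList and SelectConfigs are illustrative helpers rather than MIOpen functions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for miopenDataType_t.
enum DataType
{
    Float,
    Half
};

// Hypothetical stand-in for the real performance config; fields are placeholders.
struct PerfConfig
{
    int gemm_m_per_block;
    int gemm_n_per_block;
    int gemm_k_per_block;
};

// Table row: metadata first, then the nested perf config.
struct GeneratedData
{
    std::string direction;
    std::string layout;
    DataType datatype;
    PerfConfig perfConfig;
};

inline const std::vector<GeneratedData>& GetConfigList()
{
    // The inner braces initialize the nested PerfConfig aggregate.
    static const std::vector<GeneratedData> list{
        {"fwd", "nhwc", Float, {256, 128, 16}},
        {"fwd", "nhwc", Half, {128, 128, 32}},
    };
    return list;
}

// Collect the perf configs whose row matches the requested data type.
inline std::vector<PerfConfig> SelectConfigs(DataType type)
{
    std::vector<PerfConfig> out;
    for(const auto& row : GetConfigList())
    {
        if(row.datatype == type)
            out.push_back(row.perfConfig);
    }
    return out;
}
```

The metadata lives only in the table rows, so the perf config type itself no longer needs direction, layout, or precision members.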
I agree with that. I can make a PR this way.
@carlushuang @shaojiewang Is this fixed with the latest ROCm 6.0.2 (HIP 6.0.32831)? If resolved, please close the ticket. Thanks!