Results 9 comments of Chao Liu

Sorry, accidentally closed it.

@poyenc If you see any issues with the code, please add them to the description above. I will fix them in a major refactor PR.

@illsilin Could you evaluate the applicability of @rosenrodt 's suggestion?

More specifically, we should have a single device operation class that supports 1D to 5D tensor elementwise operations, instead of 5 separate device operation classes.
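
To illustrate the idea, here is a minimal host-side sketch in which the tensor rank is a template parameter, so one class covers 1D to 5D. The names `ElementwiseOp`, `NumElements`, and `Run` are hypothetical and are not taken from the library; the sketch also assumes packed (contiguous) buffers, which the real device operation would not be limited to.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical sketch: one class template parameterized on rank,
// instead of five separate per-rank classes.
template <std::size_t NDim>
struct ElementwiseOp
{
    static_assert(NDim >= 1 && NDim <= 5, "only 1D to 5D tensors are supported");

    std::array<std::size_t, NDim> lengths; // extent of each dimension

    // Total number of elements in a packed tensor of this shape.
    std::size_t NumElements() const
    {
        std::size_t n = 1;
        for(std::size_t len : lengths)
            n *= len;
        return n;
    }

    // Apply a binary functor element by element.
    // Assumption: both inputs are packed, so the rank only determines the element count.
    template <typename T, typename Functor>
    std::vector<T> Run(const std::vector<T>& a, const std::vector<T>& b, Functor f) const
    {
        assert(a.size() == NumElements() && b.size() == NumElements());
        std::vector<T> c(a.size());
        for(std::size_t i = 0; i < a.size(); ++i)
            c[i] = f(a[i], b[i]);
        return c;
    }
};
```

For example, `ElementwiseOp<2> op{{2, 3}};` followed by `op.Run(a, b, std::plus<float>{})` adds two 2x3 tensors, and `ElementwiseOp<5>` reuses the same code path for 5D.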

@ltqin Please disable buffer_load for gfx1030.

We should deprecate this solver/kernel, because we will use the following instead: ```ConvHipImplicitGemmBwdDataV1R1Xdlops``` and ```gridwise_convolution_backward_data_implicit_gemm_v1r1_xdlops_nchw_kcyx_nkhw.cpp```

@ltqin, please follow up on @whchung 's request.

@whchung The IR/ISA dump has been submitted to JIRA: http://ontrack-internal.amd.com/browse/SWDEV-253624

@ltqin Please test whether ROCm 3.9 fixes the issue. If not, please comment on JIRA and ask Mark for a hip-clang package with the fix: http://ontrack-internal.amd.com/browse/SWDEV-253624?focusedCommentId=6425343&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-6425343