Chao Liu
Sorry, accidentally closed it.
@poyenc If you see any issues with the code, please add them to the description above. I will fix them in a major refactor PR.
@illsilin Could you evaluate the applicability of @rosenrodt's suggestion?
More specifically, we should have a single device operation class that supports 1D to 5D tensor elementwise operations, instead of five separate device operation classes.
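A rough sketch of the idea, assuming the rank can be a template parameter. All names here (`DeviceElementwiseSketch`, `Offset`, `Run`) are hypothetical and not taken from the existing code, and a plain host loop stands in for the actual kernel launch:

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical sketch: one elementwise operation class templated on the
// tensor rank NDim, instead of one class per rank. Names are illustrative
// and are not the library's API.
template <std::size_t NDim, typename T, typename ElementwiseOp>
struct DeviceElementwiseSketch
{
    static_assert(NDim >= 1 && NDim <= 5, "only 1D to 5D tensors are supported");

    std::array<std::size_t, NDim> lengths; // extent of each dimension
    std::array<std::size_t, NDim> strides; // element stride of each dimension

    // Convert a linear element index into a strided memory offset.
    std::size_t Offset(std::size_t linear) const
    {
        std::size_t offset = 0;
        for(std::size_t d = NDim; d-- > 0;)
        {
            offset += (linear % lengths[d]) * strides[d];
            linear /= lengths[d];
        }
        return offset;
    }

    // Apply op to every element; a host loop stands in for the kernel launch.
    void Run(const T* in, T* out, ElementwiseOp op) const
    {
        std::size_t total = 1;
        for(std::size_t d = 0; d < NDim; ++d)
            total *= lengths[d];
        for(std::size_t i = 0; i < total; ++i)
            out[Offset(i)] = op(in[Offset(i)]);
    }
};
```

With this shape, the 1D through 5D cases differ only in the `NDim` template argument, so the per-rank classes would collapse into instantiations of one template.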
@ltqin Please disable buffer_load for gfx1030.
We should deprecate this solver/kernel, because we will use the following instead: ```ConvHipImplicitGemmBwdDataV1R1Xdlops``` ```gridwise_convolution_backward_data_implicit_gemm_v1r1_xdlops_nchw_kcyx_nkhw.cpp```
@ltqin, please follow up on @whchung's request.
@whchung The IR/ISA dump has been submitted to JIRA: http://ontrack-internal.amd.com/browse/SWDEV-253624
@ltqin Please test whether ROCm 3.9 fixes the issue. If not, please comment on the JIRA ticket and ask Mark for a hip-clang package with the fix: http://ontrack-internal.amd.com/browse/SWDEV-253624?focusedCommentId=6425343&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-6425343