ConvHipImplicitGemmWrwV4R4Xdlops failure due to compiler issue
Testing this PR: https://github.com/ROCmSoftwarePlatform/MIOpen/pull/413
This case fails:
MIOpenDriver convfp16 -n 256 -c 512 -H 8 -W 3 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -F 4 -t 1
MIOpen Backward Weights Conv. Algorithm: 5, Solution: 72/ConvHipImplicitGemmWrwV4R4Xdlops
GPU Kernel Time Backward Weights Conv. Elapsed: 0.238542 ms (average)
stats: name, n, c, ho, wo, x, y, k, flopCnt, bytesRead, bytesWritten, GFLOPs, GB/s, timeMs
stats: bwdw-conv1x1u1, 256, 512, 8, 3, 1, 1, 2048, 12884901888, 0, 0, 54015, 0, 0.238542
Backward Convolution Weights Failed: 0.118431 > 0.082
However, if the compiler optimization level is changed from -O3 to -O1, the case passes, which indicates a compiler bug:
MIOpen Backward Weights Conv. Algorithm: 5, Solution: 72/ConvHipImplicitGemmWrwV4R4Xdlops
GPU Kernel Time Backward Weights Conv. Elapsed: 5.383096 ms (average)
stats: name, n, c, ho, wo, x, y, k, flopCnt, bytesRead, bytesWritten, GFLOPs, GB/s, timeMs
stats: bwdw-conv1x1u1, 256, 512, 8, 3, 1, 1, 2048, 12884901888, 0, 0, 2394, 0, 5.383096
Backward Convolution Weights Verifies on CPU and GPU (0.000219028)
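As a sanity check on the two logs above, the `flopCnt` and `GFLOPs` columns can be reproduced by hand. This is a minimal sketch, not MIOpenDriver code: the formula `2 * n * c * k * ho * wo * y * x` is an assumption that happens to match the logged `flopCnt`, the helper names are hypothetical, and the 0.082 tolerance is taken from the failure message and assumed to apply to the passing run as well.

```python
# Hypothetical helpers; they only re-derive numbers already present in the logs.

def conv_flop_count(n, c, k, ho, wo, y, x):
    """Assumed FLOP count for a convolution: one multiply and one add per MAC."""
    return 2 * n * c * k * ho * wo * y * x


def gflops(flop_count, time_ms):
    """FLOPs per millisecond divided by 1e6 gives GFLOP/s."""
    return flop_count / time_ms / 1e6


# Shape from the failing command line: -n 256 -c 512 -H 8 -W 3 -k 2048 -y 1 -x 1
flops = conv_flop_count(n=256, c=512, k=2048, ho=8, wo=3, y=1, x=1)

# (kernel time in ms, reported verification error) copied from the logs
runs = {
    "-O3": (0.238542, 0.118431),
    "-O1": (5.383096, 0.000219028),
}

TOLERANCE = 0.082  # from "Failed: 0.118431 > 0.082"

for opt, (time_ms, error) in runs.items():
    verdict = "verifies" if error < TOLERANCE else "FAILS"
    print(f"{opt}: flopCnt={flops}, GFLOPs={round(gflops(flops, time_ms))}, {verdict}")
```

The -O1 build is roughly 22x slower than the -O3 build, so lowering the optimization level is useful for diagnosis but is costly as a permanent workaround.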
@ltqin Please:
- open a JIRA ticket and add "BlockingComposableKernels" tag to it: http://ontrack-internal.amd.com/issues/?jql=labels+%3D+BlockingComposableKernels
- create a workaround in MIOpen for this failed case
An example of JIRA ticket: http://ontrack-internal.amd.com/browse/SWDEV-251757
@asroy please collect the IR / ISA dumps at -O1, -O2, and -O3 so they can be compared, and post them here.
@ltqin, please follow up with @whchung 's request
This was spotted during a review of bwdw-conv1x1u1 performance for low values of w*h. As best as I can tell, the bug is only triggered when w*h is either 24 or 48.
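If the observation about w*h holds, a shape filter for a workaround could look roughly like the sketch below. This is only an illustration under that assumption: the function and constant names are hypothetical, it is not the actual MIOpen applicability check, and the set {24, 48} is the empirical finding from this comment rather than a confirmed root cause.

```python
# Hypothetical guard; names are illustrative and not part of MIOpen.
SUSPECT_SPATIAL_SIZES = {24, 48}  # w*h values reported to trigger the failure


def is_suspect_shape(h, w):
    """Return True when the spatial size w*h matches a reported failing value."""
    return h * w in SUSPECT_SPATIAL_SIZES


# The failing case from this issue uses -H 8 -W 3, so w*h == 24.
print(is_suspect_shape(8, 3))   # suspect
print(is_suspect_shape(8, 8))   # not in the reported set
```

A real workaround would likely need to be narrower or broader than this once the compiler issue is understood.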
I created a JIRA ticket: http://ontrack-internal.amd.com/browse/SWDEV-253624
@whchung The IR/ISA dump has been submitted to JIRA: http://ontrack-internal.amd.com/browse/SWDEV-253624
All, let's evaluate the importance of this. If it is high, then a workaround should be provided asap, right? /cc @daniellowell
@asroy Please provide a status update.
@ltqin Please test if rocm3.9 fixes the issue. If not, please comment on JIRA and ask Mark for a hip-clang package with the fix
http://ontrack-internal.amd.com/browse/SWDEV-253624?focusedCommentId=6425343&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-6425343
The test passes on ROCm 3.9.
@ltqin If SWDEV-251757 is fixed in ROCm 3.9, then we need to switch off the workaround, see https://github.com/ROCmSoftwarePlatform/MIOpen/pull/423#discussion_r500456160. What is the exact version number of ROCm where the bug disappeared? (You can find it in the log, if you have it saved.)
This can be closed after the workaround for SWDEV-251757 is updated.
@ppanchad-amd @junliume 🔴 The issue is not resolved, WORKAROUND_SWDEV_251757 still persists in our code. Please reopen the issue.