Results 135 comments of Jun Liu

@muralinr and @DrizztDoUrden could you please take a look with me too?

Right, I tried on another gfx900 and cannot reproduce this issue either.

Tried on a gfx906 and still cannot reproduce this issue.

@DrizztDoUrden @shurale-nkn could you reproduce this issue?

@carlushuang @shaojiewang do you have a Vega GPU to test whether the issue is reproducible?

The problem is not reproducible on a gfx900 (with ROCm 5.0 base and ROCm 5.2 docker). Lowering the urgency level. However, it is still a "high" issue since it impacts...

This issue is now happening on gfx908 again: http://micimaster.amd.com/blue/organizations/jenkins/MLLibs%2FMIOpen/detail/issue_1576_bwdfp16gpuref/5/pipeline @JehandadKhan could we assign a host/API engineer to this issue?

After some discussion: @muralinr could you try running this test multiple times on an MI100 development node and see if we can reproduce it? I would suggest some static code...

@atamazov yes, we should not close this (it was automatically closed by the merged PR for the WA). Actually, I think the urgency of this one should be higher since now we are...