b-sumner
Sorry, not yet. I'll see if we can get this looked at soon.
We have been able to spend some time on this, but unfortunately not enough yet to root cause and come up with a fix.
We don't have a fix yet, but here is an additional compile option that worked for us: -Wb,-simplifycfg-sink-common=0
The work is in progress, but not complete. It won't be released "soon", but I will update this once upstream LLVM has it.
I apologize for not getting back to update this. I believe the fix went into ROCm 3.10 and is in all later releases.
The link to HeCBench above is not working for me.
@NiuKeke I'm not sure we're clear on what kind of accuracy you mean. Could you elaborate?
Hi @yulingao, can you comment on your features? kernarg is not a concept of HIP; it's an implementation detail. So it's hard to understand why you care about kernarg if...
Hi @yulingao, thanks for the information. I can think of a few ways to do that, but only with some pretty significant restrictions. For example, many HIP (and CUDA) programs...
I'd suggest using extern "C" int __ockl_bfe_i32(int x, uint s, uint w); and extern "C" uint __ockl_bfe_u32(uint x, uint s, uint w); where "x" is the value, "s" is the...
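As a rough illustration of what a bit field extract computes, here is a minimal host-side sketch in plain C++. It is not the ROCm implementation and it does not call the `__ockl_bfe_*` functions. The helper names `ref_bfe_u32` and `ref_bfe_i32` are hypothetical, and the sketch assumes that "s" is the offset of the field's lowest bit and "w" is the field width in bits, which the comment above does not fully spell out. It also assumes 1 <= w <= 32 and s + w <= 32; behavior outside that range is not modeled.

```cpp
#include <cstdint>

// Hypothetical host-side reference, NOT the device library code.
// Assumption: s = offset of the lowest bit of the field, w = width in bits,
// with 1 <= w <= 32 and s + w <= 32.
static inline uint32_t ref_bfe_u32(uint32_t x, uint32_t s, uint32_t w) {
    // Build a mask of w low bits; handle w == 32 separately to avoid
    // shifting a 32-bit value by 32, which is undefined in C++.
    const uint32_t mask = (w >= 32u) ? 0xFFFFFFFFu : ((1u << w) - 1u);
    return (x >> s) & mask;
}

static inline int32_t ref_bfe_i32(int32_t x, uint32_t s, uint32_t w) {
    // Extract the raw field as unsigned, then sign-extend from bit (w - 1).
    const uint32_t field = ref_bfe_u32(static_cast<uint32_t>(x), s, w);
    const uint32_t sign = 1u << (w - 1u);
    // (field ^ sign) - sign propagates the field's top bit into the upper bits.
    return static_cast<int32_t>((field ^ sign) - sign);
}
```

Under those assumptions, `ref_bfe_u32(0xABCD1234u, 8, 8)` picks out the byte `0x12`, and `ref_bfe_i32(0x00000F00, 8, 4)` treats the 4-bit field `0xF` as signed. A reference like this can be handy for checking device results on the host, provided the assumed argument meaning matches the actual one.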