Valentin Churavy

Results 1415 comments of Valentin Churavy

That would be great!

I will note that you are using floating point atomics and the OP uses integer atomics. There was a recommendation from LUMI to try out unsafe-atomics, e.g. https://reviews.llvm.org/D91546?id=305522

Okay, so on LUMI the C++ code is already doing something quite different from Julia, even without `unsafe-fp-atomics`. Can you get the LLVM IR from hipcc on LUMI with something...

That should be okay. I think you might also need `-c` since you don't want to link.

You don't want `--shared -fPIC`, since your `-o` should be a `.ll`.

So we are just emitting `atomicrmw fadd double addrspace(1)* %73, double %51 seq_cst, align 8, !dbg !374`. So the issue might be that we emit `seq_cst` instead of `acq_rel`.
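To make the ordering question concrete, here is a minimal host-side C++ sketch of a floating-point atomic add where the memory order is a parameter. It is not the GPU code from this thread: `atomic_add`, `accumulate_ones`, and `check_orders_agree` are hypothetical names, and the compare-exchange loop is only a portable stand-in for the add. For reference, C++ `std::memory_order_relaxed` is what LLVM IR calls `monotonic`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not from the thread): atomically add `value` to
// `target`, using the requested memory order for the successful update.
// A compare-exchange loop keeps this valid before C++20.
inline double atomic_add(std::atomic<double>& target, double value,
                         std::memory_order order) {
    double expected = target.load(std::memory_order_relaxed);
    // On failure, `expected` is refreshed with the current value and we retry.
    while (!target.compare_exchange_weak(expected, expected + value,
                                         order, std::memory_order_relaxed)) {
    }
    return expected;  // value held before the add
}

// Accumulate `n` ones with the given ordering and return the total.
inline double accumulate_ones(int n, std::memory_order order) {
    std::atomic<double> sum{0.0};
    for (int i = 0; i < n; ++i) {
        atomic_add(sum, 1.0, order);
    }
    return sum.load();
}

// The ordering constrains how surrounding memory operations may be
// reordered around the add; it does not change the arithmetic result.
inline void check_orders_agree() {
    assert(accumulate_ones(1000, std::memory_order_seq_cst) == 1000.0);
    assert(accumulate_ones(1000, std::memory_order_relaxed) == 1000.0);
}
```

The total is the same under either ordering, so a weaker order is usually enough for a pure accumulation. Whether a weaker order changes the code generated for the GPU is exactly what the IR comparison in this thread is meant to show.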

But now you are missing `-emit-llvm -S --offload-device-only -c`?

Can you also send me the one without `unsafe-fp-atomics`? This is already promising, though: ``` %41 = load i32, i32 addrspace(1)* %23, align 4, !tbaa !11, !amdgpu.noclobber !5 %42 =...

Oh wow that is even more interesting. ``` %41 = atomicrmw fadd double addrspace(1)* %36, double %29 syncscope("agent-one-as") monotonic, align 8 %42 = atomicrmw fadd double addrspace(1)* %38, double %29...

This is consistent with `Base.@atomic`: ``` help?> @atomic @atomic var @atomic order ex Mark var or ex as being performed atomically, if ex is a supported expression. If no order...