b-sumner

105 comments by b-sumner

Yes, but using OCKL insulates you from any future ISA changes. It will always work.

@yxsamliu it is possible, but of course it would not match the CUDA results on NVIDIA. Will that not be a problem?

@yxsamliu __sinf is a CUDA/HIP function that is implemented with a call to the native sin function, while sin is implemented with a call to the regular OCML sin function....

@yxsamliu exactly. However, I'd like to note that implementing this could break existing applications that are somehow dependent on the higher accuracy.

Sounds reasonable to me. We'll need to be sure to document this change.

Hi @fromtheeast710. I'm not sure why this is being reported to ROCm and not to the LAMMPS developers, who probably have not updated their code to handle GFX10+.

Hello @fromtheeast710, the compiler is correctly stating that the source code is attempting to use an instruction field that is not supported by gfx1030. This is much preferable to the...

@fromtheeast710 the compiler treats gfx1030 and gfx1032 as separate targets; they have different names after all. Forcing ISA from one to run on another is risky.

Those functions were not removed, but their definitions were moved. It still appears something is not consistent in this build.

@jchlanda, (self & ~(width-1)) is the lowest lane in the group of width lanes that includes self. If index, the source lane, is below that value, then the shuffle up...
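The lane arithmetic in that last comment can be checked on the host. Below is a minimal C++ sketch, not the actual OCKL or HIP implementation: the helper names `group_base`, `shfl_up_source`, and `simulate_shfl_up` are hypothetical, and the out-of-group case assumes the behavior documented for CUDA's `__shfl_up_sync`, where a lane whose source would fall below its group keeps its own value.

```cpp
#include <cassert>
#include <vector>

// Lowest lane in the width-sized group that contains `self`.
// Assumes width is a power of two, as the mask trick requires.
inline int group_base(int self, int width) {
    assert(width > 0 && (width & (width - 1)) == 0);
    return self & ~(width - 1);
}

// Source lane for a shuffle-up by `delta` (hypothetical helper).
// If the source lane falls below the group's lowest lane, the lane
// reads from itself (assumed, following the CUDA documentation).
inline int shfl_up_source(int self, int delta, int width) {
    int index = self - delta;
    return index < group_base(self, width) ? self : index;
}

// Host-side model of a shuffle-up across all lanes in `vals`.
std::vector<int> simulate_shfl_up(const std::vector<int>& vals,
                                  int delta, int width) {
    std::vector<int> out(vals.size());
    for (int self = 0; self < static_cast<int>(vals.size()); ++self) {
        out[self] = vals[shfl_up_source(self, delta, width)];
    }
    return out;
}
```

For example, with width 8, lane 9 shuffling up by 2 would source lane 7, which is below the group base of 8, so in this model it keeps its own value; lane 11 sources lane 9.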