b-sumner
@VinInn, if you don't mind my asking, how did you run across this and the previous issue you raised?
Thanks. We have exhaustive tests as well, or rather tests that can be made as exhaustive as desired using command line arguments, but those arguments hadn't been set up properly...
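For anyone curious what "as exhaustive as desired" can look like, here is a rough sketch, not our actual test harness. It visits every `stride`-th single precision bit pattern in a range, so a stride of 1 is fully exhaustive and larger strides are quicker spot checks. The names `sweep`, `bits_to_float`, and `stride_from_arg` are made up for this illustration.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Reinterpret a 32-bit pattern as a float without violating aliasing rules.
static float bits_to_float(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Hypothetical helper: turn a command line argument into a stride.
// Anything missing, unparsable, or zero falls back to 1 (fully exhaustive).
static std::uint32_t stride_from_arg(const char* arg) {
    unsigned long v = arg ? std::strtoul(arg, nullptr, 0) : 0UL;
    return v == 0 ? 1u : static_cast<std::uint32_t>(v);
}

// Call `check` on every `stride`-th float bit pattern in [first, last]
// and return how many inputs it rejected.
template <class Check>
std::uint64_t sweep(std::uint32_t first, std::uint32_t last,
                    std::uint32_t stride, Check check) {
    if (stride == 0) stride = 1;  // avoid an infinite loop
    std::uint64_t failures = 0;
    // 64-bit counter so the loop ends cleanly when last is near UINT32_MAX.
    for (std::uint64_t u = first; u <= last; u += stride) {
        if (!check(bits_to_float(static_cast<std::uint32_t>(u)))) ++failures;
    }
    return failures;
}
```

In a real test the stride would come from `argv` in `main`, and `check` would compare the function under test against a higher precision reference.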
@VinInn I noticed the paper mentions large errors for double precision acosh, atanh, and log1p, but for the inputs listed, I am seeing the library producing results with error below...
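To make the comparison concrete, this is roughly how an error in ulps can be measured on the host. It is only a sketch: `ulp_error` is a hypothetical helper, not something from the library or the paper, and it assumes `long double` is wider than `double` on the machine doing the checking (that is not true on every platform) and that the reference value is finite.

```cpp
#include <cmath>
#include <limits>

// Error of `got` relative to `ref`, measured in units of the spacing
// between adjacent doubles at the magnitude of `ref`.
// Assumes `ref` is finite.
double ulp_error(double got, long double ref) {
    const double r = std::fabs(static_cast<double>(ref));
    // Distance from |ref| (rounded to double) to the next double above it.
    const double ulp =
        std::nextafter(r, std::numeric_limits<double>::infinity()) - r;
    const long double diff = std::fabs(static_cast<long double>(got) - ref);
    return static_cast<double>(diff / ulp);
}
```

For example, `ulp_error(std::log1p(x), std::log1p(static_cast<long double>(x)))` gives the error of the host `log1p` at `x`; for a device result, pass the value copied back from the GPU as `got`.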
This has been fixed.
Thanks, we'll take a look.
This has been fixed.
@MathiasMagnus the device libs evolve in parallel with the compiler, and the compilers used in, and provided by, ROCm 3.0 are more recent than LLVM 9.0.1. Also, the bitcode "libraries"...
We're working on a way to keep the device library and compiler sources from getting out of sync and will update the build instructions. Closing this as user error due...
The constraint for a scalar register is "s". The limitations in 6.2.1 of https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf state that at most one SGPR may be read per VALU instruction. You should not have...
Accessing `__shared__` memory via inline ASM is not recommended and will probably cause more harm than good. What problem are you trying to solve? Why not simply code `temp_local...[i]`?