b-sumner

Results 105 comments of b-sumner

You could try -ffp-contract=fast. Unfortunately, float2 means something different in HIP and CUDA than it does in OpenCL, so using scalars may be the best approach.
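As a rough illustration of what floating-point contraction changes, here is a minimal host-side C++ sketch using scalars. It is not taken from the original thread: the helper names `mul_then_add` and `fused_mul_add` are hypothetical, and the kernel that prompted the comment is not shown here. With `-ffp-contract=fast` the compiler is allowed to turn an expression of the form `a * b + c` into the fused form on its own.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: multiply and add as two separately rounded steps.
// The volatile store forces the product to be rounded to float first,
// so the compiler cannot fuse the two operations here.
float mul_then_add(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

// Hypothetical helper: the fused form, where a * b + c is rounded only once.
// This is the result a contracted a * b + c would produce.
float fused_mul_add(float a, float b, float c) {
    return std::fma(a, b, c);
}
```

For example, with `a = b = 1 + 2^-12` and `c = -(1 + 2^-11)`, the two-step version returns `0` because the product's low bit is rounded away, while the fused version returns `2^-24`. Whether that difference matters depends on the computation.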

@etiennemlb would it be possible for you to provide a minimal HIP application that demonstrates the issue?

Thank you. We now have an internal ticket open for this.

HIP assumes a SIMT programming model and requires the accelerator work to be expressible as a series of launches of up to 3D arrays of threads/work-items, much like CUDA,...
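To make "launches of up to 3D arrays of threads/work-items" concrete, here is a small serial emulation in plain C++. It is only a sketch of the indexing idea, not HIP code: `Dim3`, `emulate_launch`, `linear_index`, and `run_example` are hypothetical names, and a real device would run the work-items in parallel rather than in nested loops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 3D extent or coordinate.
struct Dim3 { unsigned x, y, z; };

// Flatten a 3D coordinate into a linear index, with x varying fastest.
std::size_t linear_index(Dim3 id, Dim3 extent) {
    return (static_cast<std::size_t>(id.z) * extent.y + id.y) * extent.x + id.x;
}

// Serial emulation of one launch: call body once per work-item,
// passing the block coordinate and the thread coordinate within the block.
template <typename Body>
void emulate_launch(Dim3 grid, Dim3 block, Body body) {
    for (unsigned bz = 0; bz < grid.z; ++bz)
     for (unsigned by = 0; by < grid.y; ++by)
      for (unsigned bx = 0; bx < grid.x; ++bx)
       for (unsigned tz = 0; tz < block.z; ++tz)
        for (unsigned ty = 0; ty < block.y; ++ty)
         for (unsigned tx = 0; tx < block.x; ++tx)
             body(Dim3{bx, by, bz}, Dim3{tx, ty, tz});
}

// Count how many times each global work-item position is visited.
std::vector<int> run_example(Dim3 grid, Dim3 block) {
    Dim3 total{grid.x * block.x, grid.y * block.y, grid.z * block.z};
    std::vector<int> visits(
        static_cast<std::size_t>(total.x) * total.y * total.z, 0);
    emulate_launch(grid, block, [&](Dim3 b, Dim3 t) {
        // Global coordinate = block coordinate * block size + thread coordinate.
        Dim3 g{b.x * block.x + t.x, b.y * block.y + t.y, b.z * block.z + t.z};
        ++visits[linear_index(g, total)];
    });
    return visits;
}
```

Calling `run_example(Dim3{2, 3, 1}, Dim3{4, 2, 2})` covers 8 x 6 x 2 = 96 positions, each visited exactly once, which is the property a workload needs in order to be expressed as such a launch.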

Regarding the device libraries, they are AMD-specific and assume an AMD runtime. We will not accept changes adding support for other devices or platforms. However, they can be, and...