Remove `rv_all` from GPU generated code

Open PearCoding opened this issue 3 years ago • 2 comments

GPU kernels get polluted by the `rv_all` instruction. The instruction is filtered out only if Thorin is compiled with RV. That may not be the case, since we should not expect RV to be available when Rodent is used only for GPU. The "fix" is quite simple. It would also be great to get rid of the code duplication, but that is beyond the scope of this simple PR.

PearCoding, Jul 28 '22 13:07

We should probably have some kind of generic, portable SIMD intrinsics that work regardless of the platform (RV, CUDA, AMDHSA, OpenCL, Shady, ...).
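As a rough illustration of what such a portable intrinsic could look like, here is a minimal C++ sketch of an "all lanes" vote with a per-platform dispatch. The names `PORTABLE_FN` and `portable_all` are hypothetical and not part of Rodent, Thorin, or RV. Only a CUDA path and a scalar fallback are shown, and the CUDA path assumes every thread of the warp is active.

```cpp
// Hypothetical qualifier macro so the same function compiles with and without nvcc.
#if defined(__CUDACC__)
#define PORTABLE_FN __host__ __device__
#else
#define PORTABLE_FN
#endif

// Hypothetical portable vote: returns true if `pred` holds on every lane.
PORTABLE_FN inline bool portable_all(bool pred) {
#if defined(__CUDA_ARCH__)
    // CUDA device code: vote across the warp.
    // Assumption: the full warp is active, hence the full mask.
    return __all_sync(0xffffffffu, pred) != 0;
#else
    // Scalar fallback: there is only one lane, so the vote is the predicate itself.
    return pred;
#endif
}
```

With a fallback like this, generated code would not need a backend-specific filter step to stay valid on platforms without a vectorizer; other backends (AMDHSA, OpenCL, ...) would each need their own branch.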

Hugobros3, Jan 10 '23 08:01

Yes, I agree. We should also add the "fma" instruction to the math builtins (maybe with a fallback for non-LLVM backends, e.g., OpenCL). I think there are more general-purpose intrinsics that might be handy on all systems, provided a well-behaved fallback can be defined.
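A minimal C++ sketch of the fallback idea for "fma" follows. The helper name `portable_fma` is hypothetical. `FP_FAST_FMAF` is the standard `<cmath>` macro that signals a fast fused multiply-add for `float`; using it as the dispatch condition is an assumption about how such a builtin might be selected, not how Rodent or Thorin do it.

```cpp
#include <cmath>

// Hypothetical builtin: fused multiply-add with a plain fallback.
inline float portable_fma(float a, float b, float c) {
#if defined(FP_FAST_FMAF)
    // The platform reports a fast fused multiply-add: one rounding step.
    return std::fma(a, b, c);
#else
    // Fallback: separate multiply and add. This may round twice, so the
    // result can differ from a true fma in the last bit.
    return a * b + c;
#endif
}
```

Whether the double rounding of the fallback counts as "well-behaved" depends on the caller; code that relies on the single rounding of a true fma would need `std::fma` unconditionally, at a possible cost in speed.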

PearCoding, Jan 10 '23 09:01