jcuda
HIP support
Sorry if this is not the best place to discuss this, but here goes:
AMD has recently pivoted in its GPGPU strategy. They have a new open-source software stack, ROCm, for their Fiji (R9 Fury) GPUs and future products, and have seemingly abandoned OpenCL.
https://radeonopencompute.github.io/
ROCm exposes a number of alternative APIs. They have the low-level ROCR (C host API and assembly kernel language), the high-level HC (C++ host API and C++ kernel language), and the CUDA-emulating HIP API (C host API and C++ kernel language).
Here is a comparison table of syntax between the various APIs: https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/blob/master/docs/markdown/hip_terms.md
The HIP API is particularly relevant for this project. It's basically a platform-independent CUDA Driver API that is compatible with both AMD's stack and NVIDIA's stack. This is potentially a very exciting development.
HIP page: https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP
HIP blog: http://gpuopen.com/tag/hip/
According to AMD:
- HIP is very thin and has little or no performance impact over coding directly in CUDA or hcc "HC" mode.
- HIP allows coding in a single-source C++ programming language including features such as templates, C++11 lambdas, classes, namespaces, and more.
- HIP allows developers to use the "best" development environment and tools on each target platform.
- The "hipify" tool automatically converts source from CUDA to HIP.
- Developers can specialize for the platform (CUDA or hcc) to tune for performance or handle tricky cases.
It would be very cool if there were a JHIP library for Java that would allow me to write HIP code. To be honest, I have not tried HIP yet and can't comment on how well it works in practice. However, I wanted to put it on your radar.
Note, I am NOT affiliated with AMD. I just like competition :)
(Things like this may also be posted in the JCuda forum at http://forum.byte-welt.net/forumdisplay.php?f=90&langid=2 (no registration required), but we can also discuss this here).
Although I had heard rumours about "something like a CUDA implementation by AMD", I didn't have this on the radar until now (I've been a bit busy recently; usually, I also try to respond faster to opened issues).
So thanks for bringing this up here!
There already are surprisingly many resources in these repositories. It'll take some time to sort this out.
From the very first glance at the API, it looks like they just replaced "cuda" with "hip" ;-) Although I also like competition (and open standards, in contrast to the very protective policy of NVIDIA), and had always kept an eye on things like the HSA, I wonder why AMD invests so much effort into "something new", when there already are things like OpenCL and Vulkan. (Of course, one could guess that they want a piece of the CUDA cake, but maybe there's more behind it).
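To make the "replaced cuda with hip" impression concrete, here is a minimal Java sketch of such a purely textual rename. It is only an illustration: the class name `NaiveHipify` and its `rename` method are made up for this post, and this is not how AMD's "hipify" tool is implemented, which presumably has to handle more than identifier prefixes. The runtime function names in the sample string (`cudaMalloc`, `cudaMemcpy`, `cudaFree` and their `hip` counterparts) are real API names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a naive textual cuda -> hip rename, NOT AMD's hipify tool. */
final class NaiveHipify {

    // Matches identifiers that start with "cuda" followed by an upper-case letter,
    // e.g. cudaMalloc or cudaMemcpyHostToDevice. The leading \b avoids touching
    // identifiers that merely contain "cuda" somewhere in the middle.
    private static final Pattern CUDA_RUNTIME_NAME =
            Pattern.compile("\\bcuda([A-Z]\\w*)");

    /** Returns the source text with every matched "cudaXxx" rewritten to "hipXxx". */
    static String rename(String source) {
        Matcher m = CUDA_RUNTIME_NAME.matcher(source);
        return m.replaceAll("hip$1");
    }

    public static void main(String[] args) {
        String cuda = "cudaMalloc(&d, n); "
                + "cudaMemcpy(d, h, n, cudaMemcpyHostToDevice); "
                + "cudaFree(d);";
        // Prints: hipMalloc(&d, n); hipMemcpy(d, h, n, hipMemcpyHostToDevice); hipFree(d);
        System.out.println(rename(cuda));
    }
}
```

For a Java binding the same observation would apply one level up: if the HIP functions really mirror the CUDA ones this closely, a "JHIP" API could in principle follow the existing JCuda method names with a different prefix, but that is an assumption that would have to be checked function by function.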
So right now, I cannot say how likely it is that I'll create a "JHIP" library. I'm not sure how much effort I have to invest for this, and how much I can invest.
There are many details that have to be considered. E.g. you mentioned:
> HIP allows coding in a single-source C++ programming language including features such as templates, C++11 lambdas, classes, namespaces, and more.
This is, to some extent, already possible with CUDA, but the obvious problem is that there will always be a break when going into the Java world.
However, it definitely looks interesting, and I'll try to have a closer look, and try to better sort my thoughts about this.
I will repost this on the forum so that others can participate in the discussion.
Since I'd like to start playing with HIP and want to stay in Java land, I'm going to initiate the porting effort and see how it goes. I hope you'll be available in case I run into problems. One person from the HIP side has offered to help, so I think this project should go smooth as butter.
One quick question: did you write all of the code by hand, or did you automate it?
I think AMD gave up on OpenCL because OpenCL is stupid (so to speak) and the abstractions that it envisioned have worked to hinder the programmer rather than help him. (For example, they thought they could avoid explicit mem copies :eyeroll:) Their new effort is closer to the metal. (In fact, they're abandoning chunks of HSA too, and for similar reasons, stating that they'd rather optimize their LLVM IR-to-binary compiler than the LLVM-to-HSAIL-to-binary path.) I think the driving force behind these decisions was that the vision for OpenCL (and HSA, I guess) was to be a standard that crossed whole classes of hardware, when in fact the only accelerator that anyone wants to write code for is the familiar dedicated GPU. Keep It Simple, Stupid.
(Then the discussion may proceed at the thread that you opened: https://forum.byte-welt.net/byte-welt-projekte-projects/jcuda/20996-hip-amds-cuda-clone-support.html?langid=2 - I'll not close the issue here, let's see how it evolves...)