Use a GPU to increase performance
Since the documentation says the library can use multiple cores, and GPUs have many more cores, I was thinking they could increase performance even further. How hard would that be to implement, and how large would the performance gain be?
Sure, but can you pinpoint a routine where a generic GPU would match or outperform a generic CPU? Moreover, how would you go about tuning for vastly different architectures? Which API would you use for the GPU?
GPU parallelization comes from SIMD, and the range of algorithms that can take advantage of such a paradigm is limited. CPUs also make use of SIMD, but usually when one says multiple cores, one thinks of MIMD, where each core follows its own instruction stream.
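To make the distinction concrete, here is a minimal C sketch (the names and sizes are my own; it assumes an OpenMP-capable compiler and `-fopenmp`, otherwise the pragmas are simply ignored). The first loop applies one instruction stream to many data elements, which is the SIMD shape that GPUs and CPU vector units want; the sections afterwards run independent instruction streams, which is the MIMD shape of a multicore CPU:

```c
#include <stdio.h>

#define N 1024

int main(void)
{
    double a[N], b[N], c[N];
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2 * i; }

    /* SIMD-style data parallelism: the same instruction applied to
       many data elements in lockstep. This is the shape that maps
       onto GPU threads and CPU vector lanes. */
    #pragma omp simd
    for (int i = 0; i < N; i++)
        c[i] = a[i] * b[i] + 1.0;

    /* MIMD-style task parallelism: each section is its own,
       unrelated instruction stream, running on its own core. */
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            double s = 0.0;
            for (int i = 0; i < N; i++) s += c[i];
            printf("sum = %f\n", s);
        }
        #pragma omp section
        {
            double m = c[0];
            for (int i = 1; i < N; i++) if (c[i] > m) m = c[i];
            printf("max = %f\n", m);
        }
    }
    return 0;
}
```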
I think the most obvious place to take advantage of GPUs is linear algebra over small fields, as Magma does: https://magma.maths.usyd.edu.au/magma/handbook/text/61#611
It should already be possible to do GPU-accelerated linear algebra by linking FLINT to a GPU-backed BLAS. I don't recall anyone reporting trying this.
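As a sketch of the mechanism (plain CBLAS, not FLINT code): NVBLAS is a drop-in shim that intercepts Level-3 BLAS calls such as `dgemm` and routes sufficiently large ones to the GPU, so code like the following needs no source changes to become GPU-accelerated; only the linking or preloading changes.

```c
#include <stdio.h>
#include <cblas.h>

/* Multiply two n x n double matrices with one Level-3 BLAS call.
   If the underlying BLAS is NVBLAS (linked in, or injected via
   LD_PRELOAD), a large enough dgemm is transparently offloaded to
   the GPU; otherwise the CPU BLAS runs as usual. */
int main(void)
{
    enum { n = 512 };
    static double A[n * n], B[n * n], C[n * n];

    for (int i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 2.0; }

    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0, A, n, B, n, 0.0, C, n);

    printf("C[0] = %f\n", C[0]);  /* expect 1.0 * 2.0 * 512 = 1024 */
    return 0;
}
```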
I'm just not comfortable trying this from a build-system perspective. Perhaps we should bring this up at the workshop?
In theory there shouldn't be anything for us to do: the user just specifies --with-blas with something like NVBLAS installed on the system.
I've looked around on the internet and found ArrayFire, which does some interesting things. I don't really know, though, whether it has anything like integer multiplication. I've heard that multiplication of very large integers can be broken down into very many similar operations, so maybe it is also possible with SIMD.
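That intuition is right as far as I know: FFT/NTT-based big-integer multiplication reduces a product to two forward transforms, one pointwise multiplication of coefficients, and an inverse transform, and the pointwise step is embarrassingly data-parallel. A hedged sketch of just that step (the prime and the function names are made up for illustration):

```c
#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

/* 64-bit modular multiplication via a 128-bit product; assumes a
   compiler with __uint128_t (GCC/Clang). */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t p)
{
    return (uint64_t)(((__uint128_t) a * b) % p);
}

/* Hypothetical pointwise-product step of an NTT-based big-integer
   multiplication: after both operands have been transformed, every
   coefficient pair is multiplied independently mod p. Each
   iteration is independent, so the loop maps directly onto SIMD
   lanes or GPU threads. */
void ntt_pointwise(uint64_t *c, const uint64_t *a, const uint64_t *b,
                   long n, uint64_t p)
{
    for (long i = 0; i < n; i++)
        c[i] = mulmod(a[i], b[i], p);
}

int main(void)
{
    const uint64_t p = 2147483647ULL;  /* 2^31 - 1, illustrative prime */
    uint64_t a[4] = {1, 2, 3, 4}, b[4] = {5, 6, 7, 8}, c[4];

    ntt_pointwise(c, a, b, 4, p);
    for (int i = 0; i < 4; i++)
        printf("%" PRIu64 " ", c[i]);  /* expect 5 12 21 32 */
    printf("\n");
    return 0;
}
```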
I forgot to bring this up during the workshop. However, my questions still stand:
- Which API?
- What GPUs would we support?
- Do we set thresholds ourselves, or do we let the user decide whether computations on GPUs are worth it or not?
I don't know very much about GPUs or how to compute on them, but here are my suggestions:
- I don't know.
- Probably the most mainstream ones: NVIDIA, AMD, and integrated GPUs.
- I think a rough size-based guess could decide it for each computation (see the sketch after this list).
- We are not a CAS; the user decides. GPUs are very heterogeneous in performance...
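To illustrate what a size-based guess with a user override might look like, here is a hypothetical C sketch; the function names, the default crossover, and the environment variable are all invented for illustration, and a real crossover would have to be tuned per machine (or, per the comment above, left to the user):

```c
#include <stdio.h>
#include <stdlib.h>

/* Naive CPU backend (placeholder). */
static void mat_mul_cpu(double *C, const double *A, const double *B, long n)
{
    for (long i = 0; i < n; i++)
        for (long j = 0; j < n; j++)
        {
            double s = 0.0;
            for (long k = 0; k < n; k++)
                s += A[i * n + k] * B[k * n + j];
            C[i * n + j] = s;
        }
}

/* Stand-in for a real GPU backend (cuBLAS, ArrayFire, ...); here it
   just falls back to the CPU so the sketch compiles and runs. */
static void mat_mul_gpu(double *C, const double *A, const double *B, long n)
{
    mat_mul_cpu(C, A, B, n);
}

/* Hypothetical size-based dispatch: below the crossover the cost of
   moving data to the GPU dominates; above it the GPU should win.
   The default is a guess, and the environment variable lets the
   user decide, since GPUs vary widely in performance. */
static void mat_mul_auto(double *C, const double *A, const double *B, long n)
{
    long crossover = 1024;  /* illustrative default, machine-dependent */

    const char *env = getenv("MAT_MUL_GPU_CROSSOVER");
    if (env != NULL)
        crossover = atol(env);

    if (n < crossover)
        mat_mul_cpu(C, A, B, n);
    else
        mat_mul_gpu(C, A, B, n);
}

int main(void)
{
    enum { n = 4 };
    double A[n * n], B[n * n], C[n * n];

    for (int i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 1.0; }
    mat_mul_auto(C, A, B, n);
    printf("C[0] = %f\n", C[0]);  /* expect 4.0 */
    return 0;
}
```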