adda
Iterative solver on OpenCL (GPU) devices
On recent GPU devices, the matrix-vector multiplication in adda is as fast as
the preparation of the next argument vector within the iterative solver
(currently done by the CPU). Therefore the iterative solver should also run on
the GPU, both to avoid transferring vectors from host to device in each
iteration and to speed up the computation. Since most of the functions executed
by the iterative solvers in adda are Level 1 (vector) basic linear algebra
(BLAS) functions, the clAmdBlas library could potentially be employed to
improve the execution speed further. This would mainly improve computation
speed on larger grids and with high dipole counts.
Original issue reported on code.google.com by [email protected] on 31 May 2014 at 3:36