cuda_cg icon indicating copy to clipboard operation
cuda_cg copied to clipboard

Conjugate Gradient solver written in CUDA

README for Conjugate Gradient GPU Solver - Graham Markall, 30/06/2009

(Presentation referenced below is at http://www.doc.ic.ac.uk/~grm08/g_markall_fe_acceleration.ppt )

The included solver is not exactly the same as the one whose performance results are reported in the accompanying presentation - this version of the solver has been altered to make the code more straightforward, at the expense of some efficiency (more kernel launches are required).

The file gpu_solve.cu implements a jacobi-preconditioner and conjugate gradient solver using the Compressed Sparse Row matrix format. The CSR matrix is expected to contain FORTRAN indices in the row_ptr and col_idx - that is, the indices are expected to begin at 1, and not 0. The function csr_c_to_fortran function in test/sparse_conversions.h gives an example of how to convert to the 1-based indexing if you have 0-based indexing.

The solver should work on any G200-series GPU (280GTX, Tesla C1060 etc...).

To make use of the solver in your code use:

In FORTRAN:

call gpucg_solve(row_ptr, size(row_ptr), col_idx, size(col_idx), val, size(val), rhs, size(rhs), x)

This assumes that a CSR matrix is is stored in the arrays row_ptr, col_idx and val. The known right-hand side vector is stored in rhs, and x is a vector in which the solution to the system will be placed. row_ptr and col_idx are arrays of integers, val, rhs and x are arrays of doubles.

In C:

gpucg_solve_(*row_ptr, &size_row_ptr, *col_idx, &size_col_idx, *val, &size_val, *rhs, &size_rhs, *x)

This assumes that *row_ptr, *col_idx and *val contain pointers to the arrays representing the CSR matrix. *rhs is a pointer to the array storing the known right-hand side vector. x is a pointer to an area of memory where the solution may be stored. These are all of type double. size_row_ptr etc. all contain the the length of the respective array, and are integers.

To compile the solver, use:

nvcc -arch sm_XX -c gpu_solve.cu -o gpu_solve.o

Then link this with your code in the usual way.

Testing

There is a test for the CG solver in the test directory. You may need to adjust the Makefile to suit your system. The test solves a small system and checks that the solution is sufficiently close to a reference solution. To run the tests:

cd test make ./test

Problems/feedback

If you have any issues using this code please contact me at [email protected] and I will attempt to be of assistance.