adda
Add compile option to use single precision
Some older versions of ADDA (before the public release) had an option to use
single precision. It was removed because CPUs were already efficient with
double precision at that time. The only benefit of using single
precision is lower memory use, which is not relevant in many practical
simulations. Moreover, single precision can lead to much slower convergence of
the iterative solver when convergence is already slow with double precision.
However, ADDA is expected to be able to employ GPUs (issue 118). Currently
many of them (especially consumer-grade ones) work much faster with single
precision than with double. Memory requirements are also more stringent for
GPUs, since fitting all the important data into the memory of the GPU itself
produces a large performance boost.
Changing between single and double precision should be a matter of changing
the definitions of variables and replacing several floating-point functions
that have separate versions for floats and doubles. Definitions are best done
with typedefs like "real", which can be switched by defines at compile time.
The function replacement can be done conveniently with the type-generic macros
in <tgmath.h>, a feature of C99. This feature is supported by recent versions
of gcc (http://gcc.gnu.org/c99status.html), but some library issues remain.
There may also be issues with older versions of gcc or other compilers.
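
A minimal sketch of the typedef-plus-<tgmath.h> approach described above. The
macro name SINGLE_PRECISION and the function norm1() are hypothetical, chosen
only for illustration; they are not taken from the ADDA sources.

```c
#include <assert.h>
#include <stddef.h>
#include <tgmath.h>  /* C99 type-generic math: fabs() picks fabsf() for float */

/* SINGLE_PRECISION is a hypothetical switch, e.g. passed as -DSINGLE_PRECISION */
#ifdef SINGLE_PRECISION
typedef float real;
#else
typedef double real;
#endif

/* Sum of absolute values of a vector, written once for both precisions. */
real norm1(const real *v, size_t n)
{
    assert(v != NULL || n == 0);
    real sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* With <tgmath.h>, fabs() resolves by argument type, so the result
           has type "real" and no hidden float -> double promotion occurs. */
        sum += fabs(v[i]);
    }
    return sum;
}
```

The same source then builds in either precision. Other functions such as
sqrt(), exp() or cos() are dispatched by <tgmath.h> in the same way.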
Moreover, memory allocation functions also need to be checked for possible
issues, especially the few functions provided by FFTW. Also, floating-point
numerical constants are of type double by default, so it may be a good idea
to redefine them as floats for the single-precision version.
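
The two points above (constants and allocation sizes) could be handled roughly
as follows. This is only a sketch: SINGLE_PRECISION, REAL_C(), PI_REAL and
alloc_real_vector() are hypothetical names, not existing ADDA identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* SINGLE_PRECISION is a hypothetical compile-time switch */
#ifdef SINGLE_PRECISION
typedef float real;
#define REAL_C(x) x##f  /* pastes the 'f' suffix: REAL_C(0.5) becomes 0.5f */
#else
typedef double real;
#define REAL_C(x) x     /* unsuffixed literals are already double */
#endif

/* A constant that has the working precision in both builds */
#define PI_REAL REAL_C(3.14159265358979323846)

/* Compile-time check that the literal really has type "real" */
static_assert(sizeof(PI_REAL) == sizeof(real), "constant/real size mismatch");

/* Allocate a zero-filled vector; the element size follows the typedef, so
   no sizeof(double) is hard-coded. The caller releases it with free(). */
real *alloc_real_vector(size_t n)
{
    real *v = calloc(n, sizeof *v);
    return v;  /* NULL on failure */
}
```

For FFTW itself, the single-precision library exposes a parallel API with the
fftwf_ prefix (for example fftwf_malloc and fftwf_complex instead of
fftw_malloc and fftw_complex), so the calls would need a similar
compile-time mapping.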
Another interesting idea is a hybrid single-double method, proposed by Evgenij
Zubko (Penttila et al., J. Quant. Spectrosc. Radiat. Transfer 106, 417-436
(2007), doi:10.1016/j.jqsrt.2007.01.026). All large arrays are allocated in
single precision, while some (or all) of the scalar coefficients used in
iterative methods are in double precision. This may improve the convergence of
the iterative solver (to something in between that of the purely single- and
double-precision versions) at almost no extra memory cost (compared to a
purely single-precision version).
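
To illustrate the hybrid idea, here is a rough sketch of two building blocks
of an iterative solver with float arrays and double scalars. The function
names are hypothetical. This is neither the implementation from the cited
paper nor ADDA code, only one plausible reading of the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product of two single-precision arrays, accumulated in double.
   Only the scalar accumulator is double, so the memory cost is negligible. */
double dot_hybrid(const float *x, const float *y, size_t n)
{
    assert(x != NULL && y != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += (double)x[i] * (double)y[i];
    }
    return sum;
}

/* y = alpha*x + y with a double-precision coefficient; the arithmetic is
   done in double and rounded to float only when stored back into y. */
void axpy_hybrid(double alpha, const float *x, float *y, size_t n)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) {
        y[i] = (float)(alpha * (double)x[i] + (double)y[i]);
    }
}
```

For example, for x = {16777216, 1, 1} and y = {1, 1, 1} the double
accumulator keeps both unit contributions, whereas a float accumulator would
lose them, since 2^24 + 1 is not representable in single precision.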
#118
Original issue reported on code.google.com by yurkin on 2 Dec 2010 at 5:43