Compute some Jacobian terms at compilation time

Open maxpkatz opened this issue 2 years ago • 5 comments

The current species Jacobian implementation is inefficient: at runtime it loops through every rate, checks whether that rate is used in a given Jacobian element, and only then computes its contribution. But we know at compile time which rates are used, so we can limit the calculation to only those rates and skip a lot of runtime branching. This has already been done for the RHS, and this change treats the Jacobian using a similar approach.

maxpkatz avatar Jul 31 '21 03:07 maxpkatz

I sometimes use a setup like this to test performance:

./main3d.gnu.CUDA.ex inputs_aprox13 amrex.the_arena_init_size=0 n_cell=64 max_grid_size=1024 unit_test.dens_min=1.e7 unit_test.dens_max=1.e7 unit_test.temp_min=1.e8 unit_test.temp_max=1.e8

This means that every zone does basically the same thing and there are only a few RHS calls per zone, so this is a useful way of checking whether there are any inefficiencies in the code implementation (separately from the known algorithmic problem of differing numbers of ODE steps in the general case). For this setup on a V100, this change improves performance by over 2x (183 ms -> 83 ms), and instead of the Jacobian construction being the bottleneck, the linear algebra becomes the bottleneck, as expected.

maxpkatz avatar Jul 31 '21 03:07 maxpkatz

Tests: http://groot.astro.sunysb.edu/Microphysics/test-suite/gfortran/2021-07-30-002/index.html

zingale avatar Jul 31 '21 15:07 zingale

http://groot.astro.sunysb.edu/Microphysics/test-suite/gfortran/2021-07-31-002/index.html

zingale avatar Jul 31 '21 16:07 zingale

Some compilation failures: http://groot.astro.sunysb.edu/Microphysics/test-suite/gfortran/2021-10-07-003/index.html

zingale avatar Oct 07 '21 18:10 zingale

Setting this to draft since it can be rewritten after #791 is in.

maxpkatz avatar Oct 20 '21 18:10 maxpkatz