
TTV computation comparisons

ericagol opened this issue on Feb 22 '18 · 13 comments

  • [x] 1). Compare with TTVFast.
  • [x] 2). Compare speed of integration with C code.
  • [x] 3). Compare precision with Rebound (if possible).
  • [ ] 4). Check the precision of the times and derivatives for long-term integrations.
  • [x] 5). Check the conservation of energy and angular momentum for long-term integrations as a function of step-size.
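Item 5 can be checked in isolation with a toy problem. A minimal sketch (hypothetical, not NbodyGradient code, and using a plain leapfrog rather than the AH18 integrator): integrate a circular two-body orbit to a fixed time at two step sizes and compare the relative energy error.

```julia
# Toy check of energy conservation vs. step size (hypothetical sketch,
# not NbodyGradient code): leapfrog (kick-drift-kick) on a circular
# two-body orbit with mu = 1 and semi-major axis a = 1.
function energy(x, v, mu)
    return 0.5 * (v[1]^2 + v[2]^2) - mu / hypot(x[1], x[2])
end

function rel_energy_error(h, nsteps; mu = 1.0)
    x = [1.0, 0.0]          # circular orbit: r = 1
    v = [0.0, 1.0]          # circular speed sqrt(mu/r) = 1
    E0 = energy(x, v, mu)
    for _ in 1:nsteps
        r3 = hypot(x[1], x[2])^3
        v[1] -= (h / 2) * mu * x[1] / r3    # half kick
        v[2] -= (h / 2) * mu * x[2] / r3
        x[1] += h * v[1]                    # drift
        x[2] += h * v[2]
        r3 = hypot(x[1], x[2])^3
        v[1] -= (h / 2) * mu * x[1] / r3    # half kick
        v[2] -= (h / 2) * mu * x[2] / r3
    end
    return abs((energy(x, v, mu) - E0) / E0)
end

e_coarse = rel_energy_error(0.02, 500)      # integrate to t = 10
e_fine   = rel_energy_error(0.01, 1000)     # same t, half the step
```

For a second-order symplectic scheme the error stays bounded and scales roughly as the square of the step size; the same kind of sweep over step sizes applies to angular momentum.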

ericagol · Feb 22 '18 19:02

We compared the integrator with the C implementation of Hernandez & Bertschinger. The Julia version is ~5-10 times slower. This is surprising: in my experience, and based on discussions on the Julia Discourse, it should be possible to get Julia code to run at C speeds.

@langfzac @dmhernan I think we should: 1). Compare universal.c with the Julia implementation to see whether this is where most of the time difference originates. 2). Ask for help on the Julia Discourse with an example.

ericagol · Jan 30 '21 22:01

Without having seen the code, I can tell that usually making the function type-stable and reducing useless memory allocations is most of the work.
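As an illustration of the allocation point (a hypothetical toy kernel, not code from this package), compare an allocating kernel with an in-place one that reuses a caller-supplied buffer:

```julia
# Hypothetical toy kernel, not NbodyGradient code: the allocating version
# creates a fresh 3-element array on every call; the in-place version
# writes into preallocated storage, so a hot loop triggers no GC pressure.
function step_alloc(x::Vector{Float64}, v::Vector{Float64}, h::Float64)
    return x .+ h .* v                  # allocates a new array each call
end

function step_inplace!(out::Vector{Float64}, x::Vector{Float64},
                       v::Vector{Float64}, h::Float64)
    @inbounds for i in eachindex(x)
        out[i] = x[i] + h * v[i]        # reuses the caller's buffer
    end
    return out
end

x = [1.0, 2.0, 3.0]
v = [0.1, 0.2, 0.3]
out = similar(x)
step_inplace!(out, x, v, 0.01)
```

Measured inside a function (e.g. with `@allocated`, or `@btime` from BenchmarkTools), the in-place version should report zero allocations after compilation, while the allocating version pays for an array plus eventual garbage collection on every call.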

giordano · Jan 30 '21 23:01

@ericagol @langfzac Sure, fine by me. To isolate the Kepler solver, all pairs could be placed in the Kepler group; the drifts should have negligible compute cost relative to the Kepler solver. To isolate it even further, the solver can be used to integrate a two-body problem over long times. I will send code demonstrating this.

dmhernan · Jan 31 '21 00:01

@giordano Yes, we've done that - but maybe we missed something, or could do a better job!

ericagol · Jan 31 '21 01:01

@dmhernan Thanks!

ericagol · Jan 31 '21 01:01

The comparison with universal.c was useful - thanks to a few small changes, we can now match the C speed. See this commit (an older version of NbodyGradient is used in the TRAPPIST1_Spitzer repository, since those tests were run with Julia v1.0.1, with which the current repository is no longer compatible):

https://github.com/ericagol/TRAPPIST1_Spitzer/commit/0a3c35c91801e845958ad28def71a0069e291954

outer_ss$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.1 (2018-09-29)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> include("test.jl")
test_kepler_step (generic function with 1 method)

julia> test_kepler_step(1)
Relative energy error after step 0.01 is dE/E= 3.5527136788004e-15
Final state: [0.0, -0.0480049, 0.047346, 0.0, -4.97779, 1.97086, 0.0, 0.0674248, 4.92801, 1.0, 0.34227, -1.57501e-12]

julia> @time test_kepler_step(10000000)
Relative energy error after step 0.01 is dE/E= -5.443423489736988e-13
Final state: [0.0, -1.98984, 0.00253468, 0.0, -0.0090298, -0.0708824, 0.0, 1.98984, 0.0089395, 1.0, 0.00502565, -3.73126e-16]
  1.480160 seconds (151 allocations: 4.766 KiB)

julia>
outer_ss$ time ./test2
Relative energy error after step 0.01 is dE/E=1.83624e-11
Final state: -1.98984 0.00253463 0 -0.00902962 -0.0708824 0

real    0m1.539s
user    0m1.534s
sys     0m0.003s

ericagol · Feb 02 '21 19:02

(These two codes are currently not in the repository - please let me know if you would like to see them. The version of universal.c which was used for this test is outdated, hence the poorer conservation of energy and slight disagreement between the final states.)

ericagol · Feb 02 '21 19:02

So, to improve the speed of the ah18/ttv code, we need to avoid array allocations (allocating x0 & v0 in kepler_step caused the biggest slowdown in the above test, even though these are only 3 elements each). What would be the best way to do this? Create a set of cached arrays for use within the module? Hold them in a structure which is passed through functions at different levels of the algorithm? It would be nice if there were some way in Julia to pre-allocate arrays for a function before it is called, avoiding the allocation and garbage collection each time the function is called.
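The second option might look like this sketch (`Workspace` and `kepler_scratch!` are illustrative names, not the package's API): create the scratch arrays once, hold them in a small struct, and pass that struct down to the inner kernels.

```julia
# Hypothetical sketch of the "workspace struct" option; `Workspace` and
# `kepler_scratch!` are illustrative names, not NbodyGradient's API.
struct Workspace
    x0::Vector{Float64}     # scratch copy of the pre-step position
    v0::Vector{Float64}     # scratch copy of the pre-step velocity
end
Workspace() = Workspace(zeros(3), zeros(3))

function kepler_scratch!(w::Workspace, x::Vector{Float64},
                         v::Vector{Float64}, h::Float64)
    @inbounds for i in 1:3
        w.x0[i] = x[i]      # stash inputs without an allocating `copy(x)`
        w.v0[i] = v[i]
        x[i] += h * v[i]    # advance in place (stand-in for the real step)
    end
    return x
end

w = Workspace()             # allocated once, outside the hot loop
x = [1.0, 0.0, 0.0]
v = [0.0, 1.0, 0.0]
kepler_scratch!(w, x, v, 0.01)
```

Every later call reuses `w`, so the per-call allocation (and the garbage collection it feeds) disappears from the integration loop.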

ericagol · Feb 02 '21 19:02

@ericagol I've already done some of these things in my fork. I use the State type to hold preallocated arrays for the compensated summation, and there are also PreAllocArrays types for the derivative arrays. This just needs to be extended to the kepler_step/drift functions (on my to-do list..). A straightforward (but perhaps less elegant) way to do this would be to append the relevant arrays to State, since it's already passed through the integrator.
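The "append to State" variant might be sketched like this (hypothetical field and function names; the real State type differs):

```julia
# Hypothetical sketch of appending scratch arrays to a State-like struct
# that is already threaded through the integrator; field names are made up.
struct State
    x::Matrix{Float64}       # positions, 3 × n bodies
    v::Matrix{Float64}       # velocities, 3 × n bodies
    xtmp::Vector{Float64}    # scratch appended for the inner kernels
    vtmp::Vector{Float64}
end
State(n::Int) = State(zeros(3, n), zeros(3, n), zeros(3), zeros(3))

function drift!(s::State, i::Int, h::Float64)
    @inbounds for k in 1:3
        s.xtmp[k] = s.x[k, i]       # pre-step position saved in scratch
        s.x[k, i] += h * s.v[k, i]  # advance body i in place
    end
    return s
end

s = State(2)
s.v[1, 1] = 1.0
drift!(s, 1, 0.5)
```

Since the struct is built once per integration, the inner kernels get their scratch storage for free wherever State is already in scope.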

langfzac · Feb 02 '21 20:02

@giordano Thanks for the suggestion, Mosé! It did turn out that array allocation was causing most of the slowdown. We also found some other improvements, which let us match C speed in this test of kepler_step using the DH17 algorithm. Now @langfzac is trying to remove the allocations from the full integrator with the updated algorithm (currently called AH18, but not yet published!).

ericagol · Feb 03 '21 23:02

I'm glad you worked this out! 😃

giordano · Feb 03 '21 23:02

Wow, interesting feature of Julia.

dmhernan · Feb 04 '21 00:02

@dmhernan Yes, and not a feature I am fond of...

ericagol · Feb 04 '21 00:02