mercury
mercury copied to clipboard
IEEE invalid operation when checking for central body collision
Hi all,
I sometimes stumbled across the following warning message when running Mercury:
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG
With not much more to go on, I added the gfortran compiler flag -ffpe-trap=invalid (and removed -O2 as otherwise the debugger didn't shoe the correct line) to find the problem: calling acos() for an undefined value. Here is the part (line 963 in mercury6_2.for):
u0 = sign (acos((1.d0 - r0/a )/e), rv0)
I then added some debug output to check the values of r0, a, e and rv0 right before the error:
| parameter | value |
|---|---|
| r0 | 8699.8554879043295 |
| a | 4349.9294907593367 |
| e | 0.99999919683544536 |
| inside acos | -1.0000000000218023 |
| rv0 | -8.0957151629424393E-006 |
I run large suites of simulations, keeping big.in the same but having randomized particles in small.in, so every run is slightly different, and it doesn't happen in most runs. I attached the initial conditions for a sample run where the bug does occur.
- System: Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-76-generic x86_64)
- gfortran: GNU Fortran (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0 using the provided Makefile
- Mercury: most up-to-date version of this repository, except for three parameters I changed in mercury.inc before compiling (CMAX = 1000 and NFILES = 200)
A possible solution is referenced in #18, but it's probably not the best way as it masks possible other bugs as was pointed out.
Cheers
I reproduced this using the supplied initialization files. My system:
Description: Ubuntu Focal Fossa (development branch)
Release: 20.04
Codename: focal
----- OS kernel name/release/version -----
5.4.0-12-generic #15-Ubuntu SMP Tue Jan 21 15:12:29 UTC 2020
----- Machine hardware platform -----
x86_64
----- gfortran
9.2.1-28ubuntu1
mercury 6 stdout/stderr:
Integrating massive bodies and particles up to the same epoch.
Beginning the main integration.
Time: 1368.925 years dE/E: 4.69510E-08 dL/L: 3.48461E-14
Time: 2737.851 years dE/E: -7.21165E-07 dL/L: 4.09303E-14
Time: 4106.776 years dE/E: -6.46237E-07 dL/L: 5.47581E-14
Time: 5475.702 years dE/E: -1.01207E-06 dL/L: 5.27300E-14
Time: 6844.627 years dE/E: -6.24822E-07 dL/L: 2.06495E-14
Time: 8213.552 years dE/E: -7.97592E-07 dL/L: 8.11232E-15
Time: 9582.478 years dE/E: -5.56704E-07 dL/L: 1.56346E-13
Integration complete.
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG
Added flag -ffpe-trap=invalid:
Integrating massive bodies and particles up to the same epoch.
Beginning the main integration.
Time: 1368.925 years dE/E: 4.69510E-08 dL/L: 3.48461E-14
Time: 2737.851 years dE/E: -7.21165E-07 dL/L: 4.09303E-14
Time: 4106.776 years dE/E: -6.46237E-07 dL/L: 5.47581E-14
Time: 5475.702 years dE/E: -1.01207E-06 dL/L: 5.27300E-14
Time: 6844.627 years dE/E: -6.24822E-07 dL/L: 2.06495E-14
Time: 8213.552 years dE/E: -7.97592E-07 dL/L: 8.11232E-15
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
Backtrace for this error:
#0 0x7f6e489aaca1 in ???
#1 0x7f6e489a9e75 in ???
#2 0x7f6e4867646f in ???
#3 0x7f6e4884db90 in ???
#4 0x5617e1d4b79e in mce_cent_
at ./mercury6_2.for:963
#5 0x5617e1d44608 in mal_hcon_
at ./mercury6_2.for:697
#6 0x5617e1d5f3a8 in MAIN__
at ./mercury6_2.for:189
#7 0x5617e1d5fb3e in main
at ./mercury6_2.for:221
Ran again with the custom parameter files supplied by @tobiscode. Note subroutine mce_cent
Using 2-body approximation. Collision with central body (q <= rcen). Time of impact relative to the end of the timestep (e < 0)
Note that algor = 10 (HYBRID) is specified in the custom param.in. If you change just the algor value back to BS like in the sample, no collision with central body occurs and, therefore, no program crash occurs.
Something is wrong with the collision with central body logic. It certainly needs protective code to avoid floating pt exceptions. This will probably need an astrophysicist to figure out.
One other thing although not related to this issue. The calculations specific to algor=11 look suspicious. Uninitialized arrays are used in calculations. Assumed to be zero? Bad practice.
My customized source files used for tracing are attached - ZIP of 2 files: mercury6_2.for and mercury.inc. I added a tracing flag which controls whether or not to trace. Once this bug is fixed, the customization isn't really needed.
Ran again with the custom parameter files supplied by @tobiscode. Changed algor=BS2 in param.in and got another central body collision: &MCE_CENT_E_LT_1 (output from line 979) P= 1.6726537648361467E-003, Q= 8.3632696282470525E-004, RCEN= 4.6547587699999997E-003, A= 4349.4098229833053 , E= 0.99999980771484021 , H= 58.974682251497846 , R0= 8698.8187078549345 , RV0= -2.4711172542430239E-004, /
but mercury6 continued to the end without issues.
Note that when algor=HYBRID, this display appears before the program crash: &MCE_CENT_E_LT_1 P= 6.9874155583106654E-003, Q= 3.4937091821670221E-003, RCEN= 4.6547587699999997E-003, A= 4349.9294907593367 , E= 0.99999919683544536 , H= 100.00000000000000 , R0= 8699.8554879043295 , RV0= -8.0957151629424393E-006, /
In summary,
- Both BS2 and HYBRID algorithms produced a central body collision.
- All of the algorithms ran without crashing for me except HYBRID.
It could be that @tobiscode tried a combination of parameters with algor=HYBRID that makes no sense (I am not qualified to judge). If that is true, then, optimally, the initialization code should be amended to diagnose such combinations. Needs an astrophysicist, I believe.
Clearly this early 1990s Fortran code was written to be used by scientists who would first check their parameters should a crash occur.
@tobiscode, @texadactyl, Hi guys! Sorry for a great response delay, it turned out that I got notifications turned off all the time for this repo. Thanks for your bug report. I have briefly looked over the discussion to understand the problem and it seems that I faced this problem several times before. I will try to come up with some ideas how to fix it when I get free of my job stuff.
This will probably need an astrophysicist to figure out.
Celestial mechanics specialist, to be precise ;-)
@idovgalyov-4xxi See? I didn't event know who the right fixer would be! (=:
@tobiscode Is this still an issue? If so, can figure out where the conflict is? If not, please close this issue.
Hi, I've moved on to different projects and am not working with mercury anymore. I ended up being fine with the "fix" I made in the PR, and then assumed that @idovgalyov-4xxi would eventually tackle the problem. If that hasn't happened, then I believe the problem is still there. I'm sorry I can't help anymore with this. Cheers