Style for invalid value checking

Open axch opened this issue 9 years ago • 3 comments

Checking for infinities as well as NaNs seems like a reasonable sanity precaution, and one worth keeping in the production system.
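A minimal sketch of a check that catches both cases, written in Python for illustration rather than in the model's Fortran. `has_invalid` is a hypothetical helper name, not a routine from the code base; it relies on `math.isfinite`, which is false for NaN and for either infinity.

```python
import math

def has_invalid(values):
    """Return True if any entry is NaN, +inf, or -inf."""
    # math.isfinite is False for NaN and for both infinities,
    # so one test covers every kind of bad value.
    return any(not math.isfinite(v) for v in values)

# Toy stand-in for a model field (hypothetical data).
h = [1.0, 2.5, float("inf")]
print(has_invalid(h))  # True
```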

axch, Feb 21 '17 00:02

I agree. I'd put this fairly low on the priority list, though: I've not had a simulation in which the model was able to carry infinities around without producing NaN values very quickly. Because these bad values propagate quickly from one variable to the others, the code only checks for NaNs in the h field. https://github.com/edoddridge/MIM/blob/master/MIM.f90#L760
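As a small illustration of why infinities tend not to survive for long, here is a Python sketch of the IEEE 754 rules involved. It says nothing about how quickly this happens in the model itself, which depends on the arithmetic the model performs.

```python
import math

inf = float("inf")

# Each expression combines infinities (or an infinity and zero)
# in a way that IEEE 754 defines as an invalid operation, yielding NaN.
results = [inf - inf, inf * 0.0, inf / inf]
print(all(math.isnan(r) for r in results))  # True
```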

It is also highly likely that there is a more efficient way of testing for bad values than the one currently implemented. This subroutine is from my very early days with Fortran.

edoddridge, Mar 05 '17 02:03

There is a second question here, prompted by the observation that the code that writes out the average-value dumps checks the h array for NaNs (again), rather than hav. The question is: where do we want to be on the spectrum between catching invalid values early and spending little effort on validity checks?

Reasonable choices include:

  • Check every state array on every time step. This catches the first NaN; we may also want to dump the state off schedule so that the NaN is easier to debug.
  • Check every state array, or one state array, on a fixed schedule, such as every 10 time steps.
  • Check every array that is dumped to the output. This maintains the invariant that there will be no undetected NaNs in the output dumps.
    • Con: If output dumps are rare, a simulation may run for a considerable amount of extra time before stopping due to a NaN.
    • Pro: Don't need to write extra code for off-schedule output dumping, since it's already being dumped.
  • Status quo: check just one array at output dump time. Not sure what is gained by not checking all of them.
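The first three options differ mainly in when the check runs. A rough Python sketch of that decision follows, for illustration only: the policy names, the `check_every` and `dump_every` intervals, and the `u` and `v` arrays are all assumptions, not taken from the model.

```python
import math

def first_invalid(state):
    """Return the name of the first array holding a NaN or infinity, else None."""
    for name, values in state.items():
        if any(not math.isfinite(v) for v in values):
            return name
    return None

def should_check(step, policy, check_every=10, dump_every=100):
    """Decide whether to validate the state at this time step.

    The policy names and default intervals are illustrative assumptions.
    """
    if policy == "every_step":
        return True
    if policy == "fixed_schedule":
        return step % check_every == 0
    if policy == "at_dump":
        return step % dump_every == 0
    raise ValueError(f"unknown policy: {policy}")

# Toy state: h is clean, u has gone bad (hypothetical data).
state = {"h": [1.0, 2.0], "u": [0.1, float("nan")], "v": [0.0, 0.0]}

if should_check(10, "fixed_schedule"):
    print(first_invalid(state))  # u
```

Checking every array, rather than only h, is what lets the sketch report `u` here; a check of h alone would pass this state.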

axch, Mar 21 '17 11:03

The options described above could be tied to the debug_level parameter: as debug_level is increased, the model becomes stricter about checking for invalid values. This would obviate any speed concerns associated with regularly checking arrays, while still providing the ability to detect the first NaN if the user desires that behaviour.
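One way this mapping might look, sketched in Python. The thresholds and the policy names are hypothetical; the thread does not specify which debug_level values would enable which checks.

```python
def check_policy(debug_level):
    """Map a debug level to a checking policy.

    The thresholds and returned names are illustrative assumptions.
    """
    if debug_level >= 2:
        # Strictest: validate every state array on every time step.
        return "all_arrays_every_step"
    if debug_level == 1:
        # Validate every array that is written to an output dump.
        return "all_arrays_at_dump"
    # Default: the status quo of checking only h at dump time.
    return "h_only_at_dump"

for level in (0, 1, 2):
    print(level, check_policy(level))
```

With a default of 0, production runs keep the current cost, and the stricter checks are only paid for when a user raises the level to chase down a NaN.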

edoddridge, Jun 23 '17 16:06