aronnax Automatically measure execution and result consistency across platforms

The relevant dimensions of variability in the computing environment are:

Operating system
Operating system version
Fortran compiler
Fortran compiler version
Fortran compiler optimization flag setting
Processor architecture
SciPy version, for as long as the Python code depends on it
NetCDF library version, if the Fortran code begins to depend on it
(Also Python version, but I think it's safe to assume it is determined by the operating system version, rather than independently variable)

Ideally, there would be an automatic build that, jointly across those dimensions:

Confirms the model builds and runs,
Measures and reports numerical discrepancies in the answers, and
Measures and reports runtime and memory use variation
On low-resolution examples with complete code coverage.

Even more ideally, it would characterize how variations scale with resolution and run length.
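As a rough illustration of the "measures and reports numerical discrepancies" step, here is a minimal sketch that compares two flattened output fields from different configurations. The function name `discrepancy` and the choice of maximum absolute and maximum relative difference as metrics are assumptions made for this example, not something the project has settled on.

```python
def discrepancy(reference, candidate):
    """Return (max absolute difference, max relative difference).

    Both arguments are equal-length sequences of floats, e.g. a
    flattened output field from two different build configurations.
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs have different lengths")
    max_abs = 0.0
    max_rel = 0.0
    for r, c in zip(reference, candidate):
        diff = abs(r - c)
        max_abs = max(max_abs, diff)
        # Scale by the larger magnitude; skip points where both are zero.
        scale = max(abs(r), abs(c))
        if scale > 0.0:
            max_rel = max(max_rel, diff / scale)
    return max_abs, max_rel
```

Running the same comparison at several resolutions and run lengths would give a first picture of how the discrepancies scale.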

Note 1: A complete matrix test would be overkill; the best that can be hoped for would be random sampling of configurations within the space, possibly with extra attention to extremes like the latest and earliest supported versions of everything.
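One way the sampling in Note 1 could look, as a sketch only: always include the two "extreme" configurations, then add a seeded random sample of the rest. The axis names and values in `AXES` are hypothetical placeholders, not a list of supported platforms, and each axis is assumed to be ordered from earliest to latest.

```python
import itertools
import random

# Hypothetical axes and values, for illustration only.
# Each list is assumed to be ordered from earliest to latest.
AXES = {
    "os": ["os-old", "os-mid", "os-new"],
    "compiler_version": ["v1", "v2", "v3"],
    "opt_flag": ["-O0", "-O1", "-O2", "-O3"],
}


def extreme_configs(axes):
    """Configurations taking the first, or the last, value on every axis."""
    earliest = {name: values[0] for name, values in axes.items()}
    latest = {name: values[-1] for name, values in axes.items()}
    return [earliest, latest]


def sample_configs(axes, n_random, seed=0):
    """Return the extremes plus n_random other configurations."""
    rng = random.Random(seed)  # seeded so a failing sample can be reproduced
    names = list(axes)
    full = [dict(zip(names, combo))
            for combo in itertools.product(*axes.values())]
    chosen = extreme_configs(axes)
    remaining = [c for c in full if c not in chosen]
    chosen += rng.sample(remaining, min(n_random, len(remaining)))
    return chosen
```

For example, `sample_configs(AXES, 5)` picks 7 of the 36 possible configurations here. Recording the seed alongside the results would let a failing configuration be regenerated later.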

Note 2: Getting all the dimensions at once would be overkill too; incremental progress consists of adding one dimension at a time, starting with those most likely to cause trouble due to varying across installations.

This subsumes what's left of Issue #29.

Mar 08 '17 12:03 axch

Of these, the optimisation flag should be the easiest to implement - we can simply loop through a number of them in the Python tests. We might also expect this to have one of the largest impacts on the output. Seems like a good place to start to me.
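A minimal sketch of what looping over the flags might look like, assuming gfortran and its standard `-O0` to `-O3` options. The source file name `model.f90`, the executable naming scheme, and the helper names are hypothetical; the real tests would need to plug in the project's own build step.

```python
import subprocess

# gfortran optimisation levels to loop over.
OPT_FLAGS = ["-O0", "-O1", "-O2", "-O3"]


def compile_command(source, flag, compiler="gfortran"):
    """Build the compiler command line for one optimisation flag."""
    # Hypothetical naming scheme: one executable per flag, e.g. model_O2.
    exe = "model" + flag.replace("-", "_")
    return [compiler, flag, source, "-o", exe]


def build_with_each_flag(source, flags=OPT_FLAGS, run=subprocess.run):
    """Compile once per flag and return {flag: compiler return code}."""
    results = {}
    for flag in flags:
        completed = run(compile_command(source, flag))
        results[flag] = completed.returncode
    return results
```

Calling `build_with_each_flag("model.f90")` would leave one executable per flag, which the tests could then run and compare against each other. The `run` parameter is there only so the loop can be exercised without a compiler installed.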

Mar 08 '17 21:03 edoddridge

A sub-choice of this is to choose what Fortran standard to aim for. Assuming Fortran is source-level backward compatible, the polite thing is to use the oldest standard that has the features necessary for the program to work, so that it will compile correctly under the greatest variety of tools.

Considerations:

The file extension .f90 suggests, to a naive reader, that it's meant to be Fortran 90.
Gfortran has switches that appear to be for checking compliance to the Fortran 95, Fortran 2003, and Fortran 2008 standards, but, as far as I am aware, not Fortran 90. Do we have access to a Fortran 90 compiler or conformance checker to test with? Will anyone else want to use a compiler that only understands Fortran 90 to run MIM?

Mar 10 '17 05:03 axch

The code is currently aimed at the Fortran 90 standard, but the changes between F90 and F95 are relatively minor. Fortran 90 was chosen because it is the oldest Fortran standard that allows free-form source.

I suspect gfortran is likely to be the most common compiler. Given that it has a check for F95 compliance, and that the F95 standard does deal somewhat with allocatable arrays, I think it makes sense to switch to F95.
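For reference, a sketch of how an F95 compliance check could be assembled from the Python side, using gfortran's `-std=`, `-pedantic`, and `-fsyntax-only` options. The helper name `conformance_command` and the file name `model.f90` are hypothetical, and whether `-pedantic` is wanted is a judgment call.

```python
# Standards that, per the discussion above, gfortran can check against.
CHECKABLE_STANDARDS = ("f95", "f2003", "f2008")


def conformance_command(source, standard="f95", compiler="gfortran"):
    """Command line that checks `source` against a standard without linking."""
    if standard not in CHECKABLE_STANDARDS:
        raise ValueError("no conformance check assumed for " + standard)
    # -fsyntax-only: check the code but do not produce an executable.
    # -pedantic: also warn about extensions beyond the chosen standard.
    return [compiler, "-std=" + standard, "-pedantic", "-fsyntax-only", source]
```

Passing the result to `subprocess.run` and checking the return code would turn this into a pass/fail test.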

Mar 10 '17 14:03 edoddridge
