sst-core

Floating point changes with newer XCode

Open gvoskuilen opened this issue 2 years ago • 4 comments

XCode 14.3+ defaults to -ffp-contract=fast, which causes test failures due to differences in floating point statistics. Setting -ffp-contract=off fixes the failure, but a more general solution for these kinds of changes across compilers would be to add tolerances to floating point statistics in the test pass criteria.
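For context on why contraction changes results: it allows the compiler to fuse `a * b + c` into a single fused multiply-add that rounds once instead of twice. The following is only an illustrative sketch in Python, not code from sst-core; the input values are chosen by hand to make the effect visible, and the fused result is emulated with exact rational arithmetic rather than a hardware instruction.

```python
from fractions import Fraction

# Hand-picked illustrative inputs (not taken from any SST statistic).
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Unfused: the product is rounded to a double, then the sum is rounded again.
unfused = a * b + c

# Fused (emulated): compute a*b + c exactly, then round once to a double.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(unfused, fused)
```

The two results differ because the exact product, 1 - 2^-60, rounds to 1.0 before the addition in the unfused case, whereas the fused case keeps the -2^-60 term.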

gvoskuilen, Oct 30 '23 17:10

There are a few ways to do this. The main problem is that difflib, part of the Python standard library, has no facility for handling numbers, only text.

  • Use a third-party compiled tool (https://www.nongnu.org/numdiff/). It's stable and I think relatively standardized in HPC.
  • Implement some numeric diffing. Parsing the output of difflib.unified_diff, which is already processed, is possibly a non-starter, since there is no guarantee about the ordering of the - and + lines. Some modification of the difflib code itself (https://github.com/python/cpython/blob/3be9b9d8722696b95555937bb211dc4cda714d56/Lib/difflib.py#L1095) would be required.
    • There is not an equivalent of numdiff for Python that I could quickly find.
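As a rough idea of what the pure Python option could look like, here is a minimal sketch that compares two texts line by line and token by token, treating numeric tokens as equal when they agree within a tolerance. It is not the sst-core implementation and does not modify difflib: the names `lines_match` and `files_match` are hypothetical, it assumes both files have the same number of lines in the same order (so it cannot report insertions or deletions the way a real diff does), and `math.isclose` may define relative error differently from numdiff.

```python
import math
import re

# Separators mirror the ones passed to numdiff in this thread: ';', tab, newline, space.
_SEPARATORS = re.compile(r"[;\t\n ]+")


def _as_float(token):
    """Return the token as a float, or None if it is not numeric."""
    try:
        return float(token)
    except ValueError:
        return None


def lines_match(ref_line, out_line, rel_tol=1.0e-4, abs_tol=0.0):
    """Hypothetical helper: True if two lines agree up to numeric tolerance."""
    ref_tokens = [t for t in _SEPARATORS.split(ref_line) if t]
    out_tokens = [t for t in _SEPARATORS.split(out_line) if t]
    if len(ref_tokens) != len(out_tokens):
        return False
    for ref_tok, out_tok in zip(ref_tokens, out_tokens):
        if ref_tok == out_tok:
            continue  # identical text, numeric or not
        ref_val, out_val = _as_float(ref_tok), _as_float(out_tok)
        if ref_val is None or out_val is None:
            return False  # non-numeric text must match exactly
        if not math.isclose(ref_val, out_val, rel_tol=rel_tol, abs_tol=abs_tol):
            return False
    return True


def files_match(ref_text, out_text, **tolerances):
    """Hypothetical helper: positional, line-by-line comparison of two texts."""
    ref_lines = ref_text.splitlines()
    out_lines = out_text.splitlines()
    return len(ref_lines) == len(out_lines) and all(
        lines_match(r, o, **tolerances) for r, o in zip(ref_lines, out_lines)
    )
```

For example, `lines_match("value = 6729315.500000;", "value = 6729316.000000;")` is true with the default tolerance, while any difference in a non-numeric token still counts as a mismatch. The line format in that example is made up for illustration.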

Numdiff output looks like

$ numdiff -s ';\t\n ' /Users/ejberqu/development/sst/github/sst-core/tests/refFiles/test_StatisticsComponent_basic.out /Users/ejberqu/development/sst/github/sst_test_outputs/run_data/test_StatisticsComponent_basic.out
----------------
##151     #:13  <== 6729315.500000
##151     #:13  ==> 6729316.000000
@ Absolute error = 5.0000000000e-1, Relative error = 7.4301762193e-8
----------------
##152     #:13  <== 6729315.500000
##152     #:13  ==> 6729316.000000
@ Absolute error = 5.0000000000e-1, Relative error = 7.4301762193e-8
----------------
##153     #:13  <== 6729315.500000
##153     #:13  ==> 6729316.000000
@ Absolute error = 5.0000000000e-1, Relative error = 7.4301762193e-8
----------------
##333     #:13  <== 19143114.000000
##333     #:13  ==> 19143116.000000
@ Absolute error = 2.0000000000e+0, Relative error = 1.0447621009e-7
----------------
##334     #:13  <== 19143114.000000
##334     #:13  ==> 19143116.000000
@ Absolute error = 2.0000000000e+0, Relative error = 1.0447621009e-7
----------------
##335     #:13  <== 19143114.000000
##335     #:13  ==> 19143116.000000
@ Absolute error = 2.0000000000e+0, Relative error = 1.0447621009e-7
----------------
##414     #:13  <== 25604938.000000
##414     #:13  ==> 25604940.000000
@ Absolute error = 2.0000000000e+0, Relative error = 7.8109933326e-8
----------------
##415     #:13  <== 25604938.000000
##415     #:13  ==> 25604940.000000
@ Absolute error = 2.0000000000e+0, Relative error = 7.8109933326e-8
----------------
##416     #:13  <== 25604938.000000
##416     #:13  ==> 25604940.000000
@ Absolute error = 2.0000000000e+0, Relative error = 7.8109933326e-8

+++  File "/Users/ejberqu/development/sst/github/sst-core/tests/refFiles/test_StatisticsComponent_basic.out" differs from file "/Users/ejberqu/development/sst/github/sst_test_outputs/run_data/test_StatisticsComponent_basic.out"
$ echo $?
1
$ numdiff -s ';\t\n ' -r 1.0e-4 /Users/ejberqu/development/sst/github/sst-core/tests/refFiles/test_StatisticsComponent_basic.out{,2}

+++  Files "/Users/ejberqu/development/sst/github/sst-core/tests/refFiles/test_StatisticsComponent_basic.out" and "/Users/ejberqu/development/sst/github/sst-core/tests/refFiles/test_StatisticsComponent_basic.out2" are equal
$ echo $?
0
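The second run passes because every reported relative error is several orders of magnitude below the 1.0e-4 threshold given with -r. A quick arithmetic check of the three distinct value pairs from the output above, assuming the relative error is taken with respect to the reference value (numdiff's exact definition may differ slightly):

```python
# Value pairs (reference, new) copied from the numdiff output above.
pairs = [
    (6729315.5, 6729316.0),
    (19143114.0, 19143116.0),
    (25604938.0, 25604940.0),
]
REL_TOL = 1.0e-4  # the value passed to numdiff via -r


def relative_error(ref, new):
    # Assumption: error is measured relative to the reference value.
    return abs(new - ref) / abs(ref)


for ref, new in pairs:
    err = relative_error(ref, new)
    print(f"{ref:.6f} -> {new:.6f}: relative error {err:.10e}, within tolerance: {err < REL_TOL}")
```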

berquist, Mar 22 '24 15:03

I don't remember the exact details, but the reason we switched to difflib instead of just invoking diff like we used to is that we were getting strange artifacts that we couldn't work around when invoking diff through Python. @gvoskuilen, do you remember the exact details? I'm a little concerned that going to an externally compiled utility could lead to those issues again.

feldergast, Mar 22 '24 17:03

I would rather have a first-party pure Python implementation, and will give it an honest try; my concern is that it will be flaky and hard to maintain. On the other hand, the testing infrastructure code seems largely untouched by most people, so frequent modification shouldn't be a point of contention.

berquist, Mar 22 '24 20:03

We are going to pass on adding statistics tolerance for this problem and instead reuse https://github.com/sstsimulator/sst-core/commit/51f2786f5aaa755d8796b39c56bbf737c2b53485.

berquist, May 08 '24 19:05

Closed by #1074

berquist, May 21 '24 13:05