
Analysis results are non-deterministic despite a deterministic minimizer and no fluctuations

jllanfranchi opened this issue 7 years ago

Running

export CUDA_VISIBLE_DEVICES=0
export PISA_RESOURCES=/fastio/justin/pisa_resources
export PISA_FTYPE=fp32
for t in {1..100}
do
    OUTDIR=/tmp/test${t}
    $PISA/pisa/scripts/analysis.py discrete_hypo \
        --h0-pipeline settings/pipeline/example_mc.cfg \
        --h0-param-selections=ih \
        --h1-param-selections=nh \
        --data-param-selections=nh \
        --data-is-mc \
        --min-settings settings/minimizer/l-bfgs-b_ftol2e-5_gtol1e-5_eps1e-4_maxiter200.json \
        --metric=chi2 \
        --no-octant-check \
        --logdir $OUTDIR \
        --pprint -v \
        --allow-dirty
done

yields different results, and the number of iterations varies vastly from run to run. Setting PISA_FTYPE=fp64 is better: it takes the same number of iterations almost every time, and although the results are still not quite identical, they are close enough for the analyses we run. Single precision, however, can vary considerably from run to run, with fitted params changing by as much as 10% each time.

Comparing runs' minimizer histories (looking at the *.json.bz2 log files), the chi2 values come back different from the very beginning, while all param values are reported as equal. Then, after some number of iterations, the param values start to differ (starting with theta13, but eventually all params diverge). So it's possible that the chi2 calculation is the root of this issue. That said, it could just as well be that the actual maps feeding into that function differ slightly from the beginning, which would cause the chi2 values to differ.
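Even with a deterministic minimizer, the metric it sees need not be reproducible: in fp32, floating-point addition is not associative, so any reduction whose term order can change between runs (e.g. a parallel GPU sum over map bins) can return slightly different chi2 values for bitwise-identical inputs. A minimal sketch (illustrative only, not PISA code):

```python
import numpy as np

# Illustrative only (not PISA code): a chi2-style sum over float32 terms
# depends on accumulation order, because float addition is not associative.
rng = np.random.default_rng(0)
expected = rng.uniform(1.0, 100.0, size=10_000).astype(np.float32)
observed = (expected + rng.normal(0.0, 1.0, size=10_000)).astype(np.float32)
terms = ((observed - expected) ** 2 / expected).astype(np.float32)

# Same terms, two accumulation orders (think serial CPU sum vs. a reordered
# parallel reduction on the GPU).
chi2_fwd = np.float32(0.0)
for t in terms:
    chi2_fwd = np.float32(chi2_fwd + t)
chi2_rev = np.float32(0.0)
for t in terms[::-1]:
    chi2_rev = np.float32(chi2_rev + t)

# The two totals agree to roughly fp32 precision, but are generally not
# bit-for-bit identical.
print(chi2_fwd, chi2_rev)
```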

It does not appear that l-bfgs-b is, or should be, introducing any randomness into the process (quickly grepping through the Fortran source in scipy showed no "random", "rand", "rnd", or "seed" references, and the algorithm is documented to be deterministic). @philippeller has reported seeing similarly non-deterministic behavior when using the slsqp minimizer as well.
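One quick way to convince oneself that scipy's L-BFGS-B is deterministic given identical inputs is to run the same toy minimization twice and compare the results bit for bit (a sketch, unrelated to PISA's objective):

```python
import numpy as np
from scipy.optimize import minimize

# Toy objective: the Rosenbrock function, minimized at (1, 1).
def rosenbrock(x):
    return float((1 - x[0]) ** 2 + 100.0 * (x[1] - x[0] ** 2) ** 2)

x0 = np.array([-1.2, 1.0])
res1 = minimize(rosenbrock, x0, method="L-BFGS-B")
res2 = minimize(rosenbrock, x0, method="L-BFGS-B")

# Identical inputs yield identical iterates, so any run-to-run differences
# in the analysis must originate in the objective evaluation (the maps /
# chi2 computation), not in the minimizer itself.
assert res1.nit == res2.nit
assert np.array_equal(res1.x, res2.x)
```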

I'll look into it a little further, but possible solutions could include rounding values to known machine precision in Maps and/or rounding the value output by the chi2 function (and same for other metrics) to the precision of FTYPE.

Note that if we do round values, this should probably be done more carefully than simply rounding the mantissa to some number of decimal places. The precision limitation is ultimately in base 2, so rounding to decimal digits can discard far more precision than necessary, possibly creating artificial "flat" regions in the metric space that could confuse a minimizer. (The downside is that base-2 rounding could be slower; need to check whether there's a fast way to do it. That would also be nice for other places in the code where we currently round in base 10 and so lose more precision than necessary.)
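For the base-2 rounding idea, one cheap approach is to split the float with frexp, round the mantissa to a fixed number of binary digits, and reassemble with ldexp, so no decimal round trip ever happens. A sketch (`round_to_bits` is a hypothetical helper, not existing PISA code):

```python
import math

def round_to_bits(x, bits):
    """Round x to `bits` significant binary digits.

    Hypothetical helper (not part of PISA): uses frexp/ldexp so the
    rounding happens in base 2, keeping exactly `bits` of mantissa
    instead of whatever a decimal round happens to preserve.
    """
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)        # x == m * 2**e, with 0.5 <= |m| < 1
    scale = 2.0 ** bits
    return math.ldexp(round(m * scale) / scale, e)

# Rounding 1/3 to 24 mantissa bits (roughly float32 precision) vs. a
# decimal round to 7 digits: both approximate 1/3, but only the binary
# version controls exactly how many bits survive.
x = 1.0 / 3.0
print(round_to_bits(x, 24))
print(round(x, 7))
```

A vectorized equivalent using np.frexp/np.ldexp could process whole maps at once, which would matter if this is applied inside the fit loop.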

jllanfranchi avatar Jan 18 '18 16:01 jllanfranchi

https://github.com/scipy/scipy/issues/8677

I thought I'd put this link here before I forgot. Maybe it's related.

thehrh avatar Jun 25 '18 13:06 thehrh

Btw, I recently played with the NLopt package, which I personally find superior to scipy's optimizers. It also provides implementations of the SLSQP and L-BFGS algorithms we have grown fond of, apart from many more minimizers, including non-local ones, all with a transparent interface... maybe of interest for your NSI study @thehrh?

philippeller avatar Jun 25 '18 13:06 philippeller

I think nowadays the results are pretty consistent up to statistical fluctuations :thinking:

LeanderFischer avatar Jun 03 '24 15:06 LeanderFischer