Replace random number implementations in `src/gnu` with std library and Random123 generators
Current state of random distribution implementations that need to be changed in NEURON:
- [ ] ACG (random number generator that needs removal?). This is a linear congruential engine: https://en.cppreference.com/w/cpp/numeric/random/linear_congruential_engine
- [ ] MLCG (random number generator that needs removal?)
- [ ] Isaac64
- [ ] MCellRan4 (Should stay as it is?)
- [ ] Random123 (Should stay as it is?)
- [x] Uniform
- [x] DiscUnif
- [x] Normal
- [x] LogNormal (need to figure out how to convert the log-variance to the log-stddev expected as an argument by std::lognormal_distribution; see the conversion sketch after this list)
- [x] Binomial
- [x] Poisson
- [x] Geometric
- [ ] HyperGeometric
- [x] NegExp
- [x] Erlang
- [x] Weibull
- [ ] RandomPlay
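
For the LogNormal item above, here is a minimal Python sketch (not NEURON code) of the standard conversion from the mean and variance of the lognormal variate itself to the (mu, sigma) parameters of the underlying normal, i.e. the arguments `std::lognormal_distribution` takes; the function name and the NumPy check are only illustrative:

```python
# Standard lognormal parameter conversion (sketch, not NEURON code): given the mean
# and variance of the lognormal variate itself, compute the (mu, sigma) of the
# underlying normal distribution.
import numpy as np

def lognormal_params(mean, variance):
    mu = np.log(mean**2 / np.sqrt(variance + mean**2))
    sigma = np.sqrt(np.log(1.0 + variance / mean**2))
    return mu, sigma

# Quick sanity check with NumPy: samples drawn with (mu, sigma) should have an
# empirical mean/variance close to the requested values.
mu, sigma = lognormal_params(mean=2.0, variance=0.5)
samples = np.random.default_rng(42).lognormal(mu, sigma, size=1_000_000)
print(samples.mean(), samples.var())  # ~2.0 and ~0.5
```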
Other things that need to be taken care of:
- [ ] Remove temporary classes/functions and add them to the proper ones
- [ ] Maybe refactor the way the `RNG` and `Random` classes are structured?
- [ ] Add documentation
- [ ] Add tests
- [ ] Compare old with new implementations
Just a thought, would it be beneficial to add tests in a prequel PR? This way we'd have a base to compare against?
> Just a thought, would it be beneficial to add tests in a prequel PR? This way we'd have a base to compare against?
I think that's a good idea. I'm just not sure how to test these random distributions. What I had in mind was initializing a `Random` struct like the Python scripts I used as tests, then comparing each distribution's mean or variance with the expected values and checking whether they are close, but this might be fragile and doesn't check the actual distribution. Any suggestion would be more than welcome for this.
> Any suggestion would be more than welcome for this
@nrnhines would you have some ideas?
> how to test these random distributions
Use Random123 as the generator, pick 5 values from each distribution (restarting the generator for each distribution), and compare with the values prior to this PR. To facilitate comparison, consider using `nrn/test/cover/checkresult.py`, which can create the `rand_dist.json` file with the prior values and thereafter compare using those.
Don't worry much about other generators. mcellran4 may be machine independent but I'm not sure about the ones in the old gnu folder.
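
To make the suggestion above concrete, here is a rough Python sketch of such a fixed-value regression test, assuming the usual `h.Random` interface (`Random123`, distribution setters that return the first pick, and `repick`); the distribution subset, tolerance, and the plain-`json` bookkeeping are placeholders, and `checkresult.py` could take over the reference-file handling:

```python
# Sketch of the fixed-value comparison: restart the Random123 stream for each
# distribution, pick 5 values, and compare against a reference file generated on
# master. Plain json is used here for the bookkeeping.
import json
from neuron import h

# Distribution name -> (Random method, arguments); an illustrative subset.
DISTS = {
    "uniform": ("uniform", (0.0, 1.0)),
    "normal": ("normal", (0.0, 1.0)),   # NEURON's normal() takes (mean, variance)
    "negexp": ("negexp", (1.0,)),
    "poisson": ("poisson", (3.0,)),
}

def pick_values(n=5):
    values = {}
    for name, (method, args) in DISTS.items():
        r = h.Random()
        r.Random123(1, 2, 3)               # restart the generator per distribution
        first = getattr(r, method)(*args)  # setting the distribution returns a pick
        values[name] = [first] + [r.repick() for _ in range(n - 1)]
    return values

def write_reference(path="rand_dist.json"):
    with open(path, "w") as f:
        json.dump(pick_values(), f, indent=2)

def test_against_reference(path="rand_dist.json"):
    with open(path) as f:
        reference = json.load(f)
    new = pick_values()
    for name, ref in reference.items():
        # exact-value comparison up to a tight tolerance (the tolerance is a choice)
        assert all(abs(a - b) < 1e-12 for a, b in zip(ref, new[name])), name
```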
Coming back to this PR: what we would like to do before updating the branch and merging it is to compare the output of the refactored random number distribution functions in this PR with the output of the legacy functions. Apart from calling all the functions a few thousand times, creating a histogram, and manually comparing the new and old implementations, is there some other way we can check them automatically and turn that into a unit test? I'm also summoning some people with more math expertise than me for suggestions: @1uc @cattabiani
Even in that case it is not so simple. I suggest looking into https://en.wikipedia.org/wiki/Chi-squared_test, for example. However, there is no way to have a test that passes 100% of the time. With the standard p = 0.05 we would have a test that fails once every 20 runs, which is not ideal for CI. Is it really necessary to test these libraries? Proving something with statistics is always complicated and prone to errors: "there are lies, damned lies, and statistics".
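
For reference, a minimal sketch of what the chi-squared goodness-of-fit test mentioned above could look like with `scipy.stats.chisquare`; the uniform target, bin count, and sample size are arbitrary illustrations:

```python
# Sketch of a chi-squared goodness-of-fit check: bin the samples and compare the
# observed counts against the counts expected under the target distribution.
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=10_000)

observed, _ = np.histogram(samples, bins=10, range=(0.0, 1.0))
expected = np.full(10, len(samples) / 10)  # uniform target: equal expected counts

stat, p_value = chisquare(observed, expected)
print(stat, p_value)  # at p = 0.05 such a test fails roughly once every 20 runs
```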
If you want to compute a distance between two distributions, you can use something like the Wasserstein (Earth mover's) distance. If you have two histograms (or sampled distributions), this metric computes the minimum amount of "mass" you need to move in one histogram to get the other histogram. You can then use this as an error, just like abs(approx - exact).
There's a Python implementation readily available for 1D distributions in scipy here: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wasserstein_distance.html
This is an optimal transport library I've used in the past; it also deals with the multi-dimensional case: https://pythonot.github.io/quickstart.html
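
For illustration, a minimal sketch of how the 1D `scipy.stats.wasserstein_distance` linked above could be used; `old_samples` and `new_samples` are placeholders for draws from the legacy and refactored implementations:

```python
# Earth mover's / Wasserstein distance between two 1D sample sets using the scipy
# function linked above.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
old_samples = rng.normal(1.0, 0.1, size=10_000)  # placeholder for legacy output
new_samples = rng.normal(1.0, 0.1, size=10_000)  # placeholder for refactored output

error = wasserstein_distance(old_samples, new_samples)
print(error)  # use like abs(approx - exact): compare against a chosen tolerance
```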
If you're just asking the question: Are these samples from the same distribution? Then Kolmogorov–Smirnov would seem to fit the bill: https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
With regards to the p = 0.05 problem that @cattabiani mentioned: it's true that we'll probably not be able to write a test that never fails spuriously. However, we want to rule out programming errors. Almost all of them are going to completely change the distribution. Hence, the samples of the two distributions will be very different, and statistical tests will be very certain that the samples are from different distributions. For example:
```python
In [1]: import scipy
In [2]: import numpy as np
In [3]: x = np.random.normal(1.0, 0.1, size=100)
In [4]: y = np.random.normal(1.01, 0.1, size=100)
In [5]: scipy.stats.ks_2samp(x, y)
Out[5]: KstestResult(statistic=0.08, pvalue=0.9084105017744525, statistic_location=0.8547517139777675, statistic_sign=1)
In [6]: y = np.random.normal(1.1, 0.1, size=100)
In [7]: scipy.stats.ks_2samp(x, y)
Out[7]: KstestResult(statistic=0.43, pvalue=1.1151678185620634e-08, statistic_location=1.0632304755992028, statistic_sign=1)
```
With 100 samples, we can almost detect a 1% difference; a 10% difference has a p-value of virtually 0.
What we want is that the tests (almost) never fail due to bad sampling luck, but fail reasonably often if the distributions are different. Pick, say, p = 0.001; then we'd expect 1 in 1000 CI runs to fail. If this is still too high, you can automatically rerun and report everything as green if 1 of 3 or 2 of 3 runs is green, whatever you prefer. The reasoning behind this is that the programming errors you're looking for will cause the p-value to be 1e-6 or similar. They'll fail the test 3 times in a row almost every time you run CI. Hence, they'll still make CI go red.
What I don't know is how sensitive these tests are to sampling from the same distribution but with different random number generators.
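
A small sketch of the strict-threshold-plus-retry scheme described above; `draw_old` and `draw_new` are hypothetical stand-ins for sampling from the legacy and refactored code, and the majority-of-3 vote is one of the options mentioned:

```python
# Strict threshold (p = 0.001) plus up to three attempts, passing on a majority of
# green attempts.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng()

def draw_old(n):
    return rng.normal(1.0, 0.1, size=n)   # placeholder for the legacy sampler

def draw_new(n):
    return rng.normal(1.0, 0.1, size=n)   # placeholder for the refactored sampler

def distributions_match(n_samples=100, alpha=1e-3, attempts=3):
    greens = 0
    for _ in range(attempts):
        result = ks_2samp(draw_old(n_samples), draw_new(n_samples))
        if result.pvalue >= alpha:
            greens += 1
    # A real programming error tends to give p-values around 1e-6, so it fails every
    # attempt; spurious failures of the majority vote are vanishingly rare.
    return greens >= 2

assert distributions_match()
```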
What does "almost never fail" mean? Because even if we are using the very same distribution, a p-value test will fail once every 20 runs. Do you want to run something 100 times and do statistics on the p-value 100 times every time CI runs? I don't know the sample size, but we are talking about 10^{3+} test runs.
Spuriously, not once in 100'000 times using the above numbers.
Thanks a lot @1uc and @cattabiani for the detailed answers, and sorry for the delayed reply, but I've only seen them now. I think what's needed to move forward with this PR is then to:
- Merge `master` into this branch.
- Clean up the code (remove some old indirection).
- Figure out whether `LogNormal`, `HyperGeometric` and `RandomPlay` are used anywhere (ModelDB) and need to be implemented. IMHO, if not, we shouldn't care about these in the following steps.
- Create tests based on Luc's suggestions that compare numbers picked from the currently implemented NEURON built-in random distributions on `master` against some reference implementations from Python or C++ (see the sketch after this list).
- Update this branch to implement the selected distributions using the `Random123` random number generator. Most of the implementations of the random number distributions are already there. Supposing that there are no big issues, the tests added in step 4 should pass.
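
As a sketch of what the tests in step 4 could look like, the snippet below draws samples through NEURON's built-in `Random` class and compares them against a NumPy reference with the two-sample KS test discussed earlier; the parameters, sample sizes, and p-value threshold are illustrative, and it assumes the usual `h.Random` interface:

```python
# Compare samples drawn through NEURON's built-in Random class (master) against a
# NumPy reference for the same distribution, using the two-sample KS test.
import numpy as np
from neuron import h
from scipy.stats import ks_2samp

def neuron_normal_samples(n=10_000, mean=0.0, var=1.0):
    r = h.Random()
    r.Random123(1, 2, 3)
    first = r.normal(mean, var)            # NEURON's normal() takes (mean, variance)
    return np.array([first] + [r.repick() for _ in range(n - 1)])

def test_normal_matches_reference():
    neuron_draws = neuron_normal_samples()
    reference = np.random.default_rng(0).normal(0.0, 1.0, size=10_000)  # stddev = sqrt(var)
    assert ks_2samp(neuron_draws, reference).pvalue >= 1e-3

test_normal_matches_reference()
```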
In our discussion at the latest NEURON dev meeting we concluded the following:
- All current random number generator implementations should be removed in favor of Random123
- The current random number generator API should be kept, but it should use Random123 underneath
- In version 8.x there should be a deprecation message for the random number generators that will be removed in version 9, pointing users to Random123