chaospy
chaospy copied to clipboard
Error while creating MvNormal distributions with more than 31 random variables
I recently upgraded my chaospy version from 2.3.5 to 3.0.4, and noticed that I am no longer able to create MvNormal
distributions with more than 31 random variables. Specifically, if I try to execute the lines in the interactive mode
jdist = cp.MvNormal(np.zeros(32), np.eye(32))
cp.E(jdist)
I get the following error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/chaospy/descriptives/expected.py", line 58, in E
mom = dist.mom(numpy.array(keys).T, **kws)
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/chaospy/distributions/baseclass.py", line 339, in mom
out = [evaluation.evaluate_moment(self, kdata, cache) for kdata in K.T]
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/chaospy/distributions/baseclass.py", line 339, in <listcomp>
out = [evaluation.evaluate_moment(self, kdata, cache) for kdata in K.T]
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/chaospy/distributions/evaluation/moment.py", line 93, in evaluate_moment
out = distribution._mom(k_data, **parameters)
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/chaospy/distributions/collection/mv_normal.py", line 110, in _mom
K = numpy.mgrid[[slice(0,_+1,1) for _ in k]]
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/numpy/lib/index_tricks.py", line 172, in __getitem__
nn = _nx.indices(size, typ)
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/numpy/core/numeric.py", line 1966, in indices
res = empty((N,)+dimensions, dtype=dtype)
ValueError: sequence too large; cannot be greater than 32
I tried bypassing this situation by trying to create a joint distribution using cp.J
as
dist1 = cp.MvNormal(np.zeros(31), np.eye(31))
dist2 = cp.MvNormal(np.zeros(31), np.eye(31))
jdist = cp.J(dist1, dist2)
But I get an Assertion Error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/chaospy/descriptives/expected.py", line 58, in E
mom = dist.mom(numpy.array(keys).T, **kws)
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/chaospy/distributions/baseclass.py", line 339, in mom
out = [evaluation.evaluate_moment(self, kdata, cache) for kdata in K.T]
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/chaospy/distributions/baseclass.py", line 339, in <listcomp>
out = [evaluation.evaluate_moment(self, kdata, cache) for kdata in K.T]
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/chaospy/distributions/evaluation/moment.py", line 93, in evaluate_moment
out = distribution._mom(k_data, **parameters)
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/chaospy/distributions/operators/joint.py", line 169, in _mom
output *= evaluation.evaluate_moment(dist, kloc_, cache=cache)
File "/home/kinshuk/miniconda3/envs/check_chaospy/lib/python3.7/site-packages/chaospy/distributions/evaluation/moment.py", line 74, in evaluate_moment
"distribution %s is not of length %d" % (distribution, len(k_data)))
AssertionError: distribution MvNormal(loc=[0.0,....,0.0], scale=[[1.0,...,1.0]]) is not of length 1
I am using numpy 1.16.4, and scipy 1.2.1 to reproduce these errors. I have been treating chaospy as a black box so far, so would greatly appreciate your assistance in getting a workaround!
The former syntax is correct, and that is a bug. Good catch. I will start working on a fix. I'll let you know when it is in place.
The solution to the problem required a little bit of a workaround, which results in the code being a bit slower at calculating moments for MvNormal now. That being said, it should work as intended.
Update to version 3.0.5 and let me know if it works out for you.
Thank you for such a quick response. Unfortunately, the workaround is rather slow. I have to create a distribution with up to 256 random variables, at which point computing the mean and variance from the distribution take in excess of 45 minutes on a personal computer.
Yes, 256 random variables is quite a lot, and above what was the intended use for chaospy. The fact that this bugfix required me to move some numpy functionality to pure python really doesn't help.
Short term,
I should point out that the function you are using is full multivariate normal distribution, but as long as you are using numpy.eye
as the covariance matrix, you can simplify the construction down to chaospy.Iid(chaospy.Normal(0, 1), 256)
. It is still slow, but hopefully fast enough for what you are trying to do. Please let me know if that suffices to solve your problem.
Medium term, it should be fairly straight forward to move the two pure python components to something like C, C++ or Cython. I will have to tinker a bit to set up the machinery for it in place. But I think that your use case is an interesting one, and I'd definitely want to see how many dimension in a Gaussian distribution chaospy can scale up to.
Note to self (or @Oo1Insane1oO, if you so graciously want to help):
- Add pybind11 machinery
- Move
chaospy.bertran.operator:bindex
to C++ (with numpy arrays instead pure python objects) - Move
chaospy.distribution.collection.mv_normal:MvNormal._mom
to C++
I can take a look at it tomorrow @jonathf if still needed.
@jonathf thanks for pointing me to this. By the way, if, for any arcane reason, you prefer Fortran to C++ you could also have a look at https://github.com/pyccel/fffi instead of pybind11.