An error when using LogNormal distribution
Hello. I am trying to use ChaosPy for uncertainty quantification of a 5-dimensional problem using PCE. The input variables are set to be LogNormal. For one of the variables, the mean is about 500 and the SD is 270. For this variable I get the following error from `log_normal.py`:
""""""""""""""""""""""
File "C:\Users\User\anaconda3\lib\site-packages\chaospy\distributions\baseclass.py", line 367, in
File "C:\Users\User\anaconda3\lib\site-packages\chaospy\distributions\evaluation\recurrence_coefficients.py", line 72, in evaluate_recurrence_coefficients coeff1, coeff2 = distribution._ttr(k_data, **parameters)
File "C:\Users\User\anaconda3\lib\site-packages\chaospy\distributions\collection\log_normal.py", line 31, in _ttr (numpy.e**(naa)(numpy.e**(aa)+1)-1)numpy.e*(.5*(2*n-1)aa), \
OverflowError: (34, 'Result too large') """"""""""""""""""""""" Could you please help me fix this issue if possible? Thanks, Amir
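For reference, a minimal snippet along these lines reproduces the overflow (the quadrature call is my assumption about where my script hits `_ttr`):

```python
import chaospy as cp

# the parameters I used for the problematic variable
distribution = cp.LogNormal(500, 270)

# constructing the Gaussian quadrature rule evaluates the recurrence
# coefficients in log_normal.py, which is where the OverflowError is raised
nodes, weights = cp.generate_quadrature(3, distribution, rule="gaussian")
```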
The parameters `mu` and `sigma` refer to the mean and SD of the corresponding Gaussian distribution (before the log-transformation). When you set `mu=500` and `sigma=270` you get E[X] = e^(mu + sigma^2/2) ≈ e^36950 after the transformation, which is on a scale that will cause numpy to overflow.

So quick question: is this what you want, or do you want a log-normal distribution that has the properties E[X] = 500 and SD(X) = 270? Both are solvable, but I need to know which direction you are going before I answer.
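To make the parametrization concrete with numbers that do not overflow, here is a small sketch (the values are just illustrative):

```python
import numpy
import chaospy as cp

mu, sigma = 1.0, 0.5  # parameters of the *underlying* Gaussian
dist = cp.LogNormal(mu, sigma)

# E[X] = exp(mu + sigma**2/2) for a log-normal variable
print(dist.mom(1))                   # ~3.0802
print(numpy.e**(mu + sigma**2 / 2))  # same value, analytically
```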
Thank you for your quick response. I want a log-normal distribution that has the properties E[X] = 500 and SD(X) = 270.
Then you need to use the method of moments. For log-normal that is a bit involved, but it is described here: https://www.real-statistics.com/?p=30118

Plug in x̄ = 500 and s = 270 and calculate the corresponding `mu` and `sigma` for your problem.
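In code, the conversion looks something like this (a sketch of the standard formulas with your numbers plugged in):

```python
import numpy
import chaospy as cp

mean, sd = 500.0, 270.0  # target E[X] and SD(X)

# method-of-moments inversion for the log-normal:
#   sigma**2 = ln(1 + (sd/mean)**2)
#   mu = ln(mean) - sigma**2/2
sigma = numpy.sqrt(numpy.log(1 + (sd / mean) ** 2))
mu = numpy.log(mean) - sigma**2 / 2
print(mu, sigma)  # ~6.087, ~0.506

dist = cp.LogNormal(mu, sigma)
print(dist.mom(1))  # recovers E[X] ~= 500
```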
I used the method of moments as you explained above and was able to run the code without error. However, the standard deviation I get from ChaosPy (0.48) is smaller than the standard deviation I get from the Monte Carlo simulation (0.53) or from polynomial chaos using Dakota (0.53). Also, the number of quadrature nodes ChaosPy uses is 286 for sparse grid level 3, while Dakota uses 401 for the same sparse grid level. I think I have probably missed something that has caused this discrepancy.
The input uncertainties are 5 log-normal distributions, and I used the Gaussian rule in ChaosPy. Attached is the script I used to generate the Chaospy results.
I would greatly appreciate it if you could help fix this discrepancy.
PCE with log-normal is notoriously known to be problematic, both analytically and in practice. Dakota defines it as "transformed Normal", which I think means they are doing generalized PCE (g-PCE). This is also what I will recommend as best practice in chaospy. Give me a week and I will finally make a tutorial on g-PCE, which has been on my agenda for too long now.
As to the number of quadrature nodes: optimal Gaussian quadrature rules (`rule="gaussian"`) do not nest much, and un-nested rules do not work well with sparse grids. I don't know the details, but I think Dakota changes the quadrature rule to Genz-Keister, which is both nested and works well with Gaussian distributions. Note that Chaospy also supports `rule="genz_keister"`, but I am not sure if it is the same Genz-Keister rule, as there are a few variants.
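If you want to see the nesting effect yourself, a sketch like this compares the sparse-grid sizes (exact node counts depend on the chaospy version and growth rules):

```python
import chaospy as cp

# five independent standard Gaussians, as in your problem
dist = cp.J(*[cp.Normal(0, 1)] * 5)

for rule in ["gaussian", "genz_keister"]:
    nodes, weights = cp.generate_quadrature(3, dist, rule=rule, sparse=True)
    print(rule, "->", nodes.shape[1], "nodes")
```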
Thanks a lot for the great information you provided. I look forward to going through the tutorial when it becomes available.
Hi, you should read https://www.esaim-m2an.org/articles/m2an/pdf/2012/02/m2an110045.pdf to get an in-depth understanding of what happens when you try to use the orthogonal polynomial family associated with the log-normal distribution. To sum up the main points:
- this distribution is not determined by its moments (other distributions share the same moments)
- as a result, the associated family of orthogonal polynomials is not a Hilbert basis of the associated functional space: it means that even if the series converges, its limit can be arbitrarily far away from your function (depending on the function, of course; if it is a polynomial it's OK). Cheers
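For a concrete illustration of the first point: Heyde's classical counterexample perturbs the log-normal density without changing a single moment. A quick numerical check (a sketch, working in t = ln(x) coordinates):

```python
import numpy
from scipy.integrate import quad

# Heyde's example: f_eps(x) = f(x) * (1 + eps*sin(2*pi*ln x)), |eps| <= 1,
# has exactly the same raw moments as the standard log-normal density f.
phi = lambda t: numpy.exp(-t**2 / 2) / numpy.sqrt(2 * numpy.pi)

for n in range(5):
    # n-th raw moment of the log-normal, written in t = ln(x) coordinates
    base = quad(lambda t: numpy.exp(n * t) * phi(t),
                -numpy.inf, numpy.inf)[0]
    pert = quad(lambda t: numpy.exp(n * t) * phi(t)
                * (1 + 0.5 * numpy.sin(2 * numpy.pi * t)),
                -numpy.inf, numpy.inf)[0]
    print(n, base, pert)  # the two columns agree for every n
```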
Thank you @regislebrun, much appreciated.
@ambehnam, I've made a new tutorial here: https://chaospy.readthedocs.io/en/master/tutorials/stochastic_dependencies.html It is not tailored to your problem in particular, as I use dependent random variables as the focus, but the solution should be valid in your case as well.
For your problem, use a standard Gaussian for the R distribution.
Thank you very much for the detailed tutorial. I tried to replicate it by adding the following few lines to my code:
```python
import chaospy as cp

# problem distribution: five log-normals (method-of-moments parameters)
distribution_q = cp.J(
    cp.LogNormal(gamma_mu_g, gamma_sig_g),
    cp.LogNormal(a0_mu_g, a0_sig_g),
    cp.LogNormal(b0_mu_g, b0_sig_g),
    cp.LogNormal(D_mu_g, D_sig_g),
    cp.LogNormal(sig0_mu_g, sig0_sig_g),
)
# auxiliary distribution: five independent standard Gaussians
distribution_r = cp.J(*[cp.Normal(0, 1)] * 5)

# map nodes from R-space to Q-space (the g-PCE transform)
def transform(samples):
    return distribution_q.inv(distribution_r.fwd(samples))

nodes_sparse, weights_sparse = cp.generate_quadrature(
    quad_deg_1D, distribution_r, rule="genz_keister", sparse=True)
nodes_sparse_q = transform(nodes_sparse)
```
I was wondering whether these lines are correct?
Also, I wanted to mention that using the genz_keister rule fixed the discrepancy between Chaospy and Dakota. However, if the input dimension is increased from 5 to 14 for the same problem, the mean and std come out as very large numbers. Does this mean polynomial chaos cannot handle this dimension with quadrature order 3 or 4?
Thanks again,
Yes, that seems correct, though you are outside the edge of what chaospy has been tested for (g-PCE + log-normal + high-dim + Genz-Keister + sparse). I just ran the code on a synthetic example, and it looks like it works fine, but I'd love to hear how it works for you.
As for your follow-up question: what do you mean by "very large numbers"? Like in your first problem, going beyond numpy's max limit of about 1e308?
And quadrature orders are typically not limited upwards, but Genz-Keister is. I didn't write that code; I borrowed it from the Burkardt repo, and there the values are hard-coded. I don't know if this is because the rule is finite, or if no one took the time to actually create a general rule, but that is what is being used here and by most others I have seen using G-K.
I found the sentence in the original paper: "In general, we found that we could not obtain sequences with more than 5-7 steps, and in any event were limited by degree 67 in obtaining solutions with real roots." In other words: G-K cannot be made larger.
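If you want to see where the hard-coded rule stops on your installed version, a quick probe along these lines should work (assuming `generate_quadrature` raises once the tabulated orders run out):

```python
import chaospy as cp

dist = cp.Normal(0, 1)
order = 0
while True:
    try:
        nodes, _ = cp.generate_quadrature(order, dist, rule="genz_keister")
    except Exception:
        break
    print("order", order, "->", nodes.shape[1], "nodes")
    order += 1
print("first unsupported order:", order)
```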
Thank you very much for the comprehensive information. I will make a comparison between Dakota, Chaospy, and an MC simulation to see how it works and will share the results soon.
By very large numbers, I meant a variance of 1370 instead of 0.6. I tried the Gaussian and Genz-Keister rules and quadrature orders 2, 3, 4 and 5, but so far I haven't been able to get a valid result for the problem with 14 uncertain variables. However, the results for the 5-dimensional problem (g-PCE + log-normal + Genz-Keister + sparse) match the Monte Carlo simulation well.
It would be appreciated if you post the results here, so I know what chaospy is capable of.
A variance of 1370 is not an issue in and of itself, but 14 dims is high compared to the problems I worked on during my research.
Note that there are 4 different Genz-Keister quadrature rules: gk16, gk18, gk22 and gk24. They are quite similar, but have slightly different growth rules, resulting in different nodes and orders. Chaospy uses gk24, but has internal support for the others as well.
If you need it, I can add a patch to expose them all to the interface you are using.