QuantEcon.py
Improvements for quadrature routines
Since the quad branch is closed, I'm opening a new issue with the comments I made about it, so that we can start a fresh discussion. There are currently a few issues with the quadrature routines. The first two items can have side effects on future code and should be fixed quickly.
- The `gridmake` function does not enumerate points in a way that is consistent with default Python conventions. Computing `gridmake(array([0, 1]), array([-1, -2]))` produces:

  ```python
  array([[ 0, -1],
         [ 1, -1],
         [ 0, -2],
         [ 1, -2]])
  ```

  meaning that the first index varies fastest. Now, if you want to represent values on this grid by a 2-d array `vals` such that `vals[i, j]` contains values, then `vals.ravel()` does not enumerate points in the same order, because the last index is supposed to vary fastest. This one is quite annoying.
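A minimal numpy sketch of the mismatch (using plain list comprehensions in place of `gridmake` itself, so the two enumeration orders are explicit):

```python
import numpy as np

a = np.array([0, 1])
b = np.array([-1, -2])

# gridmake-style (Fortran) enumeration: the FIRST index varies fastest
fortran_grid = np.array([[x, y] for y in b for x in a])

# values stored on the grid as vals[i, j], i indexing a, j indexing b
vals = np.arange(4).reshape(2, 2)

# vals.ravel() uses C order (last index fastest), so it lines up with a
# C-ordered grid, not with the Fortran-ordered rows above
c_grid = np.array([[x, y] for x in a for y in b])
```

Here `vals.ravel()[k]` corresponds to `c_grid[k]`, not `fortran_grid[k]`.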
- Multidimensional functions don't always return an array with the same number of dimensions. That is not a problem in Matlab, but in Python, `quad.qnwnorm([2, 2])` returns 2-dimensional arrays while `qnwnorm([2])` returns 1-dimensional vectors. This is problematic: generic code is forced to always distinguish dimension 1 from higher dimensions. My opinion here is that multidimensional routines should always return multidimensional objects in a predictable fashion (maybe the 1-d routines could be exported too).
- (minor) It looks like `qnwnorm` will fail if one column of the covariance matrix (and the corresponding row) is full of zeros. That was already the case in the CompEcon toolbox, but it is still a common use case.
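One way to make the return shapes predictable, sketched here with a hypothetical helper (not part of the current code), is to normalize 1-d results to an `(n, 1)` column:

```python
import numpy as np

def as_2d_nodes(nodes):
    # Hypothetical helper: always return nodes as an (n, d) array,
    # promoting a 1-d vector to a single-column 2-d array.
    nodes = np.asarray(nodes)
    if nodes.ndim == 1:
        nodes = nodes[:, None]
    return nodes
```

Generic code could then always index `nodes[i, j]` regardless of `d`.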
- (minor) In the future, we may want to have more engaging names than `qnwnorm` and company, don't you think?
A possible way to deal with these issues would be to rename `quad.py` to `ce_quad.py`, so that the latter can be left untouched and remain as close as possible to the original version (including Fortran order). A `quadrature.py` could then contain Python-compliant versions, possibly with more explicit names.
As for the `gridmake` replacement, the function `cartesian` does the required thing. It is also faster (cf. http://stackoverflow.com/questions/1208118/using-numpy-to-build-an-array-of-all-combinations-of-two-arrays).
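For reference, a minimal C-order implementation along the lines of the Stack Overflow answers (a sketch, not necessarily the exact `cartesian` referred to above):

```python
import numpy as np

def cartesian(arrays):
    # Cartesian product in C order: the LAST array varies fastest,
    # matching what vals.ravel() expects.
    arrays = [np.asarray(a) for a in arrays]
    grids = np.meshgrid(*arrays, indexing='ij')
    return np.stack([g.ravel(order='C') for g in grids], axis=-1)
```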
Actually all these issues concern only the multidimensional functions.
Not sure I understand when you would use a column of zeros in a covariance matrix; doesn't that mean that the variance of that random variable would be zero?
I think everything else (especially the names) is a good idea for discussion.
Imagine you have a model that depends on two i.i.d. Gaussian shocks. You can write the covariance matrix as:

```python
sigma = array([[sigma1, 0], [0, sigma2]])
```

Assume that you have written a routine to solve your model involving a discretization of the (trivial) multivariate distribution of shocks. Now, if you want to see what happens to your model if the second shock is absent, you can just set `sigma2 = 0` and run the same routine. Currently you will probably get an error because of the Cholesky factorization step.
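One possible workaround (my own sketch, not current quant-econ code) is to fall back on an eigendecomposition when the covariance matrix is singular, since `C = V sqrt(L)` also satisfies `C @ C.T == sigma`:

```python
import numpy as np

def factor_cov(sigma):
    # Return C with C @ C.T == sigma. Plain Cholesky raises on a
    # singular matrix (e.g. a zero row/column); eigh still works.
    sigma = np.asarray(sigma, dtype=float)
    try:
        return np.linalg.cholesky(sigma)
    except np.linalg.LinAlgError:
        vals, vecs = np.linalg.eigh(sigma)
        vals = np.clip(vals, 0.0, None)  # drop tiny negative round-off
        return vecs @ np.diag(np.sqrt(vals))
```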
The current `grid_make` function in `ce_utils` was a very quick and dirty way to get just the functionality we needed for these routines to match the Matlab. I am happy to use a different function; I have actually had to use that `cartesian` function before.
If we do move away from `gridmake` and use `cartesian`, I think that all the tests that compare the Python and Matlab versions will break, as they currently assume a column-major ordering. Pablo, do you have any suggestions on what we can do to get around this issue?
In the end I would love to have tests that check for the "correct answer" rather than simply verify that we match some other code. Does anyone have any references we can turn to for computing the correct answers for any of these routines? If we can separate our tests from the Matlab versions, then I would love to move on and make this feel more pythonic.
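One source of closed-form answers is Gaussian moments: a Gauss-Hermite rule should reproduce E[x^2] = 1 and E[x^4] = 3 for the standard normal exactly. A sketch of such a test (using numpy's rule here, rather than the quant-econ routines):

```python
import numpy as np

# Gauss-Hermite rule for the weight e^{-x^2}; rescale nodes and
# weights to integrate against the standard normal density.
nodes, weights = np.polynomial.hermite.hermgauss(5)
x = nodes * np.sqrt(2.0)
w = weights / np.sqrt(np.pi)

second_moment = w @ x**2   # should equal 1 exactly
fourth_moment = w @ x**4   # should equal 3 exactly
```

A 5-point rule is exact for polynomials up to degree 9, so both checks hold to machine precision.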
I will look into the dimensionality issues and probably write some tests for them.
I haven't checked simple (yet important) things like that yet, so this is great feedback.
I'm in favor of changing the names to something more informative. Let's not tie ourselves down to exactly replicating CompEcon. As long as we implement broadly similar functionality and have good documentation I think it's fine to deviate when the deviation involves an obvious improvement.
Regarding the suggestion to "rename quad.py to ce_quad.py so that the latter can be left untouched and remain as close as possible to the original version (including Fortran order). A quadrature.py could then contain Python-compliant versions, possibly with more explicit names."
My preference would be to avoid duplication, just have one version of these routines and go for Python compliance and more explicit names. Let's break compatibility if it leads to improvement. As long as we have good documentation I think that's fine.
About the reordering and subsequent testing issues: it's clear that the current tests for multidimensional integration will break and will need to be rewritten. There are two simple ways to get around that:

- Reorder the results from Matlab so that they match the Python ordering. If `T` is a Matlab array whose rows enumerate a cross product in $R^{d_1} \times R^{d_2}$ in Matlab order, then you can define `ind = arange(d1 * d2).reshape((d1, d2), order='F').flatten(order='C')`. The reordered matrix corresponding to C order is then `T[ind, :]`.
- Use some test functions and check that the integrated values are the same using the weights and nodes supplied by Matlab or the ones computed in Python. This is only a necessary condition, since it doesn't ensure that the new ordering is correct, but it's easy to implement. Later, if you choose polynomials as test functions, you should be able to compute some of the values in closed form.
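A quick check of the reordering trick above, with illustrative values for `d1` and `d2`:

```python
import numpy as np

d1, d2 = 2, 3
# rows as Matlab would enumerate them (first index varies fastest)
T = np.array([[i, j] for j in range(d2) for i in range(d1)])

# permutation taking Fortran-ordered rows to C order
ind = np.arange(d1 * d2).reshape((d1, d2), order='F').flatten(order='C')
T_c = T[ind, :]
# T_c now enumerates the cross product with the LAST index fastest
```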
I've been looking at this issue again since the `cartesian` routine has been renamed. Replacing Fortran order with C order in `quad.py` seems very easy (as in replacing `ckron(*weights[::-1])` with `ckron(*weights)` and `gridmake` with `cartesian`).
Another idea would be to create a keyword argument `order='C'` so that Fortran order would still be an option. I have the feeling that it would require some restructuring of the `quad` code, and I don't know if it would be worth the trouble. @cc7768, @spencerlyon2: since you wrote the first version, what are your thoughts?
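A rough sketch of what such a keyword might look like for combining 1-d weight vectors (a hypothetical helper; `ckron` in the actual code plays the role of the repeated `np.kron` here):

```python
import numpy as np

def product_weights(weights_1d, order='C'):
    # Combine per-dimension weight vectors into product-grid weights.
    # order='C': last dimension varies fastest; order='F': first does.
    seq = list(weights_1d) if order == 'C' else list(weights_1d)[::-1]
    out = np.array([1.0])
    for w in seq:
        out = np.kron(out, np.asarray(w, dtype=float))
    return out
```

With `order='C'` this matches `kron` in the natural order, and with `order='F'` the reversed order, which is exactly the `ckron(*weights)` vs. `ckron(*weights[::-1])` swap mentioned above.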