pygsvd icon indicating copy to clipboard operation
pygsvd copied to clipboard

Improve column sorting, add tests, etc...

Open ddrake opened this issue 6 years ago • 4 comments

Sorry about all the commits. I probably need to get in the habit of using rebase. I'll try to write up some documentation on the column sorting, rolling and truncation in Latex. I think the rest is pretty straightforward.

The X inverse transpose feature adds a dependency on scipy, but scipy.linalg.lapack has been a pretty stable part of that package since v 0.12. I haven't done any benchmarking yet, but inverting R by back substitution and using that to get inv(X.T) should be much faster than inverting X for large problems.

Feel free to make any changes, and let me know if you have any questions.

ddrake avatar Nov 13 '19 15:11 ddrake

Don't worry about the commits, we will squash them all at merge time. (But yes, rebase is a good strategy for keeping clean commit history!)

I'm not too concerned with the performance at this point. I think it's important to verify correctness, and then worry about the speed.

Again, thanks for your help, this looks like a lot of good work!

bnaecker avatar Nov 13 '19 15:11 bnaecker

I agree regarding performance, and thanks for the encouragement! It did take a bit of time to fully understand the sorting problem. I'm attaching a pdf here that will help me remember why/how that aspect of the code works. I hope it will make things clearer for you and the other contributors as well. gsvd.pdf

ddrake avatar Nov 14 '19 05:11 ddrake

I wanted to comment on the 'cases.txt file' that is used by 'test_multi_gsvd.py'. The main assertions there verify that (using the X1=True option so that X is the inverse transpose of the default X), the singular values extracted from the products U.T@A@X and V.T@B@X match those in the returned 1D arrays C and S respectively.

Those test cases were initially generated by considering possible permutations of m, p, n, r, l (and q = m+p) combined with possible orderings {'<', '='}. From these auto-generated cases, I eliminated by hand a number of impossible edge cases and a number of duplicate cases (there may still be a few duplicates). Testing the remaining cases, some of the cases I had thought would be possible, turned out not to be. For example, a number of cases could be eliminated based on the fact that after setting the rank of (A|B), the desired rank of B could not be achieved for the specified matrix dimensions. That's where the preliminary assertions in 'test_multi_gsvd.py' turned out to be very helpful.

ddrake avatar Nov 14 '19 19:11 ddrake

I'm attaching an updated version of the gsvd.pdf file that I uploaded last month gsvd.pdf. It fixes a couple typos and adds a section that explains the edge case in which it is not possible to place the singular values on the diagonal of V.T@B@X. This section includes two examples which show how the same issue occurs in Matlab, but in a different dimensional scenario due to the different sorting convention.

ddrake avatar Dec 09 '19 19:12 ddrake