Interesting matrix-free optimization
Someone suggested an interesting optimization to CVXcanon. Right now CVXcanon takes an expression tree, where each parent node represents a linear function, and converts it into a map from each variable to its coefficient matrix (and then into a single matrix). The algorithm essentially replaces each parent node with an explicit sparse matrix and then collapses the tree by multiplying out the matrices.
But there's no need to form the matrices at each parent node. You can just have the parent node evaluate its function on the input coefficients. For example, if a node's function is to extract index i, it would extract row i of each of its input coefficient matrices. Does this make sense? It's not clear exactly how much this would help, but at least you would avoid forming a lot of matrices.
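Roughly, in Python terms (just a sketch of the idea, not the actual C++ code; the evaluator name and the {variable: coefficient matrix} dict layout are made up for illustration):

import scipy.sparse as sp

def eval_index_node(row, input_coeffs):
    # input_coeffs maps variable id -> sparse coefficient matrix.
    # Extracting row `row` from each coefficient is exactly what
    # left-multiplying by the selector matrix for index `row` would do,
    # except that no selector matrix is ever built.
    return {var: coeff.tocsr()[row, :] for var, coeff in input_coeffs.items()}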
Is this basically the idea of a presolve -- that we could avoid creating matrices by evaluating and combining appropriate terms in advance, e.g. combining Ax + Bx into (A + B)x before forming matrices? Yes, I think this would be a very good idea. Is there any existing literature on presolve algorithms for LinOp trees?
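For concreteness, the Ax + Bx -> (A + B)x merge could look something like this (illustrative Python sketch only; the function name and dict layout are not from CVXcanon):

def merge_sum_coefficients(child_coeff_maps):
    # child_coeff_maps: list of dicts {variable id: coefficient matrix},
    # one dict per child of a sum node. When two children contribute to
    # the same variable (A*x and B*x), their matrices are added once,
    # giving (A + B)*x, instead of carrying both terms forward.
    merged = {}
    for coeffs in child_coeff_maps:
        for var, coeff in coeffs.items():
            merged[var] = coeff if var not in merged else merged[var] + coeff
    return merged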
In the index example that you used, simplification would only be possible if you are extracting that index from a constant expression, correct? Or is there something that you think could be done even in the case of extracting indices from variables?
This is different from a presolve, though I also think a presolve would be great. Here's a clearer example of what I mean. Suppose we have the expression (Ax + By)[0]. That makes the tree
index[0]
    |
    +
   / \
  A   B
  |   |
  x   y
Suppose we run CVXcanon on this tree. When we're evaluating the index[0] node, it will have the coefficient inputs {x: A, y: B}. Normally what we do is form a sparse matrix C that represents index[0], then multiply C by all the coefficients, so we get {x: CA, y: CB}. But we could just directly carry out the indexing operation on A and B. So we reduce A to A[0,:] = CA and B to B[0,:] = CB, i.e., the output is {x: A[0,:], y: B[0,:]}. The gain here is that we never form the matrix C, and we can possibly evaluate CA and CB more efficiently.
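To make that concrete, here's a quick numerical check in Python (toy A and B, just illustrating the claim; none of this is CVXcanon code): the selector matrix C for index [0] and the direct row slice produce identical coefficients.

import scipy.sparse as sp

# Toy coefficients for (A x + B y)[0]; the values are arbitrary.
A = sp.random(3, 4, density=0.5, format="csr", random_state=0)
B = sp.random(3, 4, density=0.5, format="csr", random_state=1)

# Explicit route: build the selector matrix C for index [0], then multiply.
C = sp.csr_matrix(([1.0], ([0], [0])), shape=(1, 3))
explicit = {"x": C @ A, "y": C @ B}

# Direct route: slice row 0 of each coefficient; C is never formed.
direct = {"x": A[0, :], "y": B[0, :]}

for var in explicit:
    assert (explicit[var] != direct[var]).nnz == 0  # identical sparse matrices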
Hi Steven, I started to work on this on a new branch, feature/directly_apply_linops. Indexing and summing now occur with no intermediate matrix calculation. Are there any other linops you think it would make sense to apply this sort of strategy to? I think you mentioned transpose at the meeting yesterday.
Got it working for transpose, too.
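For reference, here's how I think of the transpose case (rough Python sketch, not the actual branch code): in vectorized form, transposing an m x n expression is left-multiplication by a permutation (the commutation matrix), so applying the linop directly just means reordering the rows of each coefficient instead of ever building that matrix.

import numpy as np
import scipy.sparse as sp

def transpose_coefficient_direct(coeff, m, n):
    # Reorder the rows of `coeff` so it maps vec(X) to vec(X.T) for an
    # m x n expression, without forming the commutation matrix.
    perm = np.arange(m * n).reshape((m, n), order="F").flatten()
    return coeff.tocsr()[perm, :]

# Quick sanity check on a 2 x 3 example with an identity coefficient.
m, n = 2, 3
X = np.arange(m * n, dtype=float).reshape((m, n))
K_applied = transpose_coefficient_direct(sp.eye(m * n, format="csr"), m, n)
assert np.allclose(K_applied @ X.flatten(order="F"), X.T.flatten(order="F"))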
This is great! Thanks for doing this. Does it speed up the matrix stuffing?
So I'm seeing improved performance on most problems, and performance drops on a few... but ironically I think this is almost entirely because I changed matrices from column-major to row-major in this branch. I'll investigate more fully and let you know tomorrow.
Any update on this? Matrices should probably be column major.