Use overlapping variables to help choose among plans?
In the max-product algorithm, I wrote this (where L is a label on the edge):
ecount(L,U,V) += alpha(U)*edge(L,U,V)*beta(V).
It seemed to be extremely slow (I had to abort it). I am guessing that the planner picked a suboptimal order alpha(U), beta(V), edge(L,U,V). When I forced a smarter binarization it worked fine:
alpha(L,U,V) += alpha(U)*edge(L,U,V). % temp variable
ecount(L,U,V) += alpha(L,U,V)*beta(V).
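To see why the one-rule form can blow up, here is a toy Python sketch of the two evaluation orders (the data and indexing are purely hypothetical, not Dyna internals). The bad order enumerates every alpha-beta pair before ever consulting edge; the binarized order only touches edges that actually leave a vertex with nonzero alpha:

    from collections import defaultdict

    # Toy data, made up for illustration.
    alpha = {"u1": 0.5, "u2": 0.25}
    beta  = {"v1": 0.5, "v2": 0.125}
    edge  = {("a", "u1", "v1"): 1.0, ("b", "u2", "v2"): 2.0}

    # Index edge two ways, mimicking the two access modes.
    by_uv = defaultdict(list)  # (U,V) -> [(L, weight)]
    by_u  = defaultdict(list)  # U -> [(L, V, weight)]
    for (L, U, V), w in edge.items():
        by_uv[(U, V)].append((L, w))
        by_u[U].append((L, V, w))

    # Bad plan: alpha(U), beta(V), edge(L,U,V).
    # Loops over |alpha| * |beta| pairs even when no edge connects them.
    ecount_bad = defaultdict(float)
    for U, aU in alpha.items():
        for V, bV in beta.items():
            for L, w in by_uv[(U, V)]:
                ecount_bad[(L, U, V)] += aU * w * bV

    # Binarized plan: alpha(U), edge(L,U,V), beta(V).
    # The edge loop binds L and V, so beta is then a single lookup.
    ecount_good = defaultdict(float)
    for U, aU in alpha.items():
        for L, V, w in by_u[U]:
            if V in beta:
                ecount_good[(L, U, V)] += aU * w * beta[V]

    assert ecount_bad == ecount_good

In a sparse graph the bad plan's outer double loop costs |alpha| * |beta| iterations, while the good plan does work proportional to the number of edges leaving reachable vertices.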
I know we don't really have plan costs yet, but could we do any better here using simple heuristics? The good plan here starts with only one completely free subquery, alpha(U), whereas the bad plan starts with two completely free subqueries, alpha(U) and beta(V).
I can confirm that the planner picked alpha(U), beta(V), edge(L,U,V) as the order.
Without a cardinality estimate there is no reason to believe that edge/3 is any smaller than alpha × beta... or even related to alpha and beta (static analysis can probably tell us that)... I suppose this is even the case with a binary relation edge(U,V) -- without the L.
Yes, I know there are no guarantees and we really want cardinality. I was suggesting a heuristic that might tend to work in the meantime (i.e., better than nothing) so that the students can do the assignment. Maybe you have compelling counterexamples? But my thinking was that when there are overlapping variables, they are "intended" to be used to restrict the cardinality, and furthermore that there are diminishing returns (so grounding a second variable as in the current planner's plan is not as helpful as grounding the first one).
Anything you know about det and multi should override this. If binding a variable drives the mode from multi to det, you already know that helped. And if the mode was already det, then binding the variable can't really improve it. My suggestion was only that going from multi to multi should be assumed to involve some kind of cardinality reduction.
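To make the suggestion concrete, here is a minimal sketch of such a plan scorer (the function and all its constants are made up for illustration, not the actual Dyna planner): penalize completely free subqueries hardest, and give diminishing credit for each additional overlapping variable that is already bound:

    # Hypothetical plan scorer; lower is better. A "plan" is the ordered
    # list of each subquery's variable set.
    def plan_score(plan):
        bound, score = set(), 0.0
        for goal_vars in plan:
            n = len(goal_vars & bound)   # overlapping vars already bound
            if n == 0:
                score += 4.0             # completely free subquery: worst
            else:
                score += 2.0 ** (1 - n)  # diminishing returns per extra binding
            bound |= goal_vars
        return score

    bad  = [{"U"}, {"V"}, {"L", "U", "V"}]  # alpha(U), beta(V), edge(L,U,V)
    good = [{"U"}, {"L", "U", "V"}, {"V"}]  # alpha(U), edge(L,U,V), beta(V)
    assert plan_score(good) < plan_score(bad)   # 6.0 < 8.5

A mode check along the lines of the paragraph above (score any multi-to-det transition as free, and treat an already-det goal as unimprovable) would slot in ahead of the cardinality guess; it is omitted here to keep the sketch short.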
Without a cardinality estimate there is no reason to believe that edge/3 is any smaller than alpha × beta... or even related to alpha and beta (static analysis can probably tell us that)...
The idea of using static analysis to get subsets is intriguing. It's probably not going to work here, though, because the U positions of the different subqueries are not in a subset relation. The terms that can show up in U position for alpha are a subset of the things that can show up in V position for edge (as well as the start vertex). Thus, there may be reachable states U with no out-edges, and unreachable states that do have out-edges.
The planner is picking alpha(U), beta(V), edge(L,U,V) as each such query binds two free variables (including the result of is). The alternative, alpha(U), edge(L,U,V), beta(V), requires running out is edge(out,in,out), which we've told it is much worse than can be made up for by the fact that the next query is out is beta(in). I don't understand (sorry) the alternative heuristic you are proposing.
Let me try to understand your current heuristic first. Why "much worse"? In both cases, you have nested loops over 3 variables, just in different orders (U,V,L versus U,[L,V]) and with different tests.
The current heuristic (exponentially) prefers steps that bind fewer variables over ones that bind more. Thus, as far as it is concerned, the binding of three variables in out is edge(out,in,out) is much worse than the binding of two in out is edge(out,in,in). This is countered by a preference for plans with fewer loops, but apparently not enough to help here, or we are not tracking determinism correctly (it would not shock me), or both.
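If I'm reading that right (an assumption; exactly how the per-step costs combine is what the next message asks), a cost exponential in the number of newly bound variables, summed across steps, does reproduce the planner's choice:

    # Newly bound variables per step, counting the result of 'is'.
    picked      = [2, 2, 2]  # alpha(U), beta(V), edge(L,U,V): 2 each
    alternative = [2, 3, 1]  # alpha, then edge binds out,L,V; beta binds out

    base = 10  # hypothetical per-variable branching factor
    cost = lambda steps: sum(base ** n for n in steps)
    print(cost(picked), cost(alternative))  # 300 vs. 1110: picked looks cheaper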
Yes, but if you bind 3 variables rather than 2 in the edge goal, then this pays off in the next step when you get to bind 0 rather than 1 in the beta goal. So how do you trade these things off against each other? Is it greedy? Or do you mean that the costs are something like (exp 3 + exp 0) for the first plan, which is worse than (exp 2 + exp 1) for the second plan? If you were aggregating as exp 3 * exp 0 versus exp 2 * exp 1, that would be a wash, i.e., they both have 3 nested loops.