materialize icon indicating copy to clipboard operation
materialize copied to clipboard

Likely mis-optimization in `SELECT` subquery

Open frankmcsherry opened this issue 2 years ago • 9 comments

What version of Materialize are you using?

main

How did you install Materialize?

Materialize Cloud

What is the issue?

The commands

create table test1 (a int, b int);
create table test2 (a int, b int);
create view test3 as select a, b, count(*) as c from test2 group by a, b;

insert into test1 values (1, 3);
insert into test2 values (1, 2), (1, 3);

select a, b, ( select c from test3 where a = test3.a and b = test3.b) from test1;

results in

CREATE TABLE
CREATE TABLE
CREATE VIEW
INSERT 0 1
INSERT 0 2
ERROR:  Invalid data in source, saw retractions (1) for row that does not exist: [Int32(1), Int32(3), Null]

It appears to be important that test3 has a primary key (using test2 directly in its place does not provoke the error).

Further things that do and do not reproduce the error:

materialize=> select a, ( select c from test3 where a = test3.a ) from test1;
ERROR:  Evaluation error: more than one record produced in subquery
materialize=> select b, ( select c from test3 where b = test3.b ) from test1;
ERROR:  Evaluation error: more than one record produced in subquery
materialize=> create view test4 as select a, count(*) as c from test2 group by a;
CREATE VIEW
materialize=> create view test5 as select b, count(*) as c from test2 group by b;
CREATE VIEW
materialize=> select a, ( select c from test4 where a = test4.a ) from test1;
 1 | 2
materialize=> select b, ( select c from test5 where b = test5.b ) from test1;
ERROR:  Invalid data in source, saw retractions (1) for row that does not exist: [Int32(3), Null]
materialize=> 

The evaluation errors seem fine, and are probably the correct answers.

The plan for the final query is

materialize=> explain select b, ( select c from test5 where b = test5.b ) from test1;
 Source materialize.public.test1 (u73):    +
 | Project (#1)                            +
                                           +
 Source materialize.public.test2 (u74):    +
 | Filter (#1 = #1)                        +
 | Project (#1)                            +
                                           +
 Query:                                    +
 %0 = Let l0 =                             +
 | Get materialize.public.test2 (u74)      +
 | Filter (#1 = #1)                        +
 | Project (#1)                            +
 | Reduce group=(#0)                       +
 | | agg count(true)                       +
 | Project (#1)                            +
                                           +
 %1 =                                      +
 | Get materialize.public.test1 (u73)      +
 | Project (#1)                            +
 | ArrangeBy ()                            +
                                           +
 %2 =                                      +
 | Get %0 (l0)                             +
 | Project ()                              +
 | Negate                                  +
                                           +
 %3 =                                      +
 | Constant ()                             +
                                           +
 %4 =                                      +
 | Union %2 %3                             +
 | Map null                                +
                                           +
 %5 =                                      +
 | Union %0 %4                             +
                                           +
 %6 =                                      +
 | Join %1 %5                              +
 | | implementation = Differential %5 %1.()+
 

materialize=> 

Relevant log output

No response

frankmcsherry avatar Aug 09 '22 17:08 frankmcsherry

In postgres the same result returns:

alexander> create table test1 (a int, b int);
 create table test2 (a int, b int);
 create view test3 as select a, b, count(*) as c from test2 group by a, b;

 insert into test1 values (1, 3);
 insert into test2 values (1, 2), (1, 3);

 select a, b, ( select c from test3 where a = test3.a and b = test3.b) from test1;
CREATE TABLE
CREATE TABLE
CREATE VIEW
INSERT 0 1
INSERT 0 2
more than one row returned by a subquery used as an expression

Time: 0.021s

Is the issue here that we are not throwing the same error?

aalexandrov avatar Aug 09 '22 20:08 aalexandrov

If we ever return

ERROR:  Invalid data in source, saw retractions (1) for row that does not exist: [Int32(3), Null]

we have introduced a collection with negative multiplicities. This seems like some flavor of consistency violation.

frankmcsherry avatar Aug 09 '22 20:08 frankmcsherry

Yeah I mean is this an issue for a query which is ill-defined either way. Looking at the definition

select a, b, ( select c from test3 where a = test3.a and b = test3.b ) from test1;

I think there is a slight chance that what you really want to say is:

select a, b, ( select c from test3 where a = test1.a and b = test1.b ) from test1;

I just want to double-check on that before I dig deeper.

aalexandrov avatar Aug 09 '22 20:08 aalexandrov

The issue is only about the inconsistency. The query is probably "the wrong query", but we shouldn't be introducing negative multiplicity collections without erroring. So, the bug is probably "postgres gives a clear error; we only give an error if you look at the results, and might not if you materialize them, and then use them in another query".

frankmcsherry avatar Aug 09 '22 20:08 frankmcsherry

Got it.

aalexandrov avatar Aug 09 '22 20:08 aalexandrov

select a, b, ( select c from test3) from test1;

produces the expected error, whereas

select a, b, ( select c from test3 where a = a and b = b ) from test1

and the from simpler

select a, b, ( select c from test3 where a = a and b = b ) from test1 -- equivalent to the above
select a, b, ( select c from test3 where a = a ) from test1 -- only the first predicate

only the first one has this issue, so I think the first thing is to look at differences in the plans produced by those three.

aalexandrov avatar Aug 09 '22 20:08 aalexandrov

Looks like an optimization bug, the decorrelated plans for the two queries are similar, but the optimized plans are different. Assigning to myself and I will find somebody to look at this ASAP.

aalexandrov avatar Aug 09 '22 20:08 aalexandrov

We can simplify and look just at:

select ( select c from test3 where a = a and b = b );

vs

select ( select c from test3 where a = a );

aalexandrov avatar Aug 09 '22 20:08 aalexandrov

Assigning this to @ggevay. Please take a look at that after you have opened a PR for the IN list planning.

aalexandrov avatar Aug 10 '22 07:08 aalexandrov