ingraph
How to calculate and produce count of empty aggregations
Handling aggregations incrementally is tricky. For example, the simple query of
MATCH (n)
RETURN count(n)
should return a single row (containing 0) for an empty database. So, the philosophical question is: for an empty aggregation set, do we return a 0 or nothing?
The issue can be demonstrated with a PostgreSQL console:
postgres=# select count(a)
from (select 1 as a) as subq
where a = 2;
count
-------
0
(1 row)
However, if we group by a, no row is returned:
postgres=# select count(a)
from (select 1 as a) as subq
where a = 2
group by a;
count
-------
(0 rows)
Or, if you think grouping by a is ugly, we can group by b:
postgres=# select count(a)
from (select 1 as a, 1 as b) as subq
where a = 2
group by b;
count
-------
(0 rows)
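The three console sessions above can be reproduced without a PostgreSQL server. The sketch below uses Python's built-in sqlite3 module as a stand-in. This is an assumption: SQLite is not PostgreSQL, although it applies the same rule to these queries. The helper name `run` is purely illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the PostgreSQL console.
conn = sqlite3.connect(":memory:")

def run(sql):
    """Hypothetical helper: execute a query and return all result rows."""
    return conn.execute(sql).fetchall()

# No GROUP BY: the whole (empty) input forms a single group, so one row comes back.
global_count = run(
    "select count(a) from (select 1 as a) as subq where a = 2"
)

# GROUP BY a: no input rows means no groups, so no rows come back.
grouped_by_a = run(
    "select count(a) from (select 1 as a) as subq where a = 2 group by a"
)

# GROUP BY b: same outcome with a different grouping column.
grouped_by_b = run(
    "select count(a) from (select 1 as a, 1 as b) as subq where a = 2 group by b"
)

print(global_count, grouped_by_a, grouped_by_b)
```

The first query yields a single row holding 0, while both grouped queries yield an empty result, which is the distinction an incremental engine has to preserve.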
The issue of handling nulls for OPTIONAL MATCH clauses is also related...
It's worth checking the Postgres docs:
If a query contains aggregate function calls, but no GROUP BY clause, grouping still occurs: the result is a single group row (or perhaps no rows at all, if the single row is then eliminated by HAVING). The same is true if it contains a HAVING clause, even without any aggregate function calls or GROUP BY clause.
end of 7.2.3: https://www.postgresql.org/docs/9.6/static/queries-table-expressions.html#QUERIES-GROUP
via @jmarton
Related literature (thanks to @bergmanngabor): https://dl.acm.org/citation.cfm?id=137852
This causes BI Q7 to break.
A simplified version shows the issue:
MATCH (message2:Message)
OPTIONAL MATCH (message2:Message)<-[like:LIKES]-(p3:Person)
RETURN message2.id AS m, count(like) AS likes
ingraph results: List(ArrayBuffer((likes,1), (m,44)), ArrayBuffer((likes,1), (m,88)))
neo4j results: List(ArrayBuffer((likes,1), (m,44)), ArrayBuffer((likes,1), (m,88)), ArrayBuffer((likes,0), (m,99)))
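To make the expected result concrete, here is a small sketch of the reference semantics in plain Python. The message ids and like counts are taken from the results above. The data layout and the names `messages`, `likes` and `expected_likes` are assumptions made for illustration only, not part of ingraph or Neo4j.

```python
# Message ids from the results above; 99 has no incoming LIKES edge.
messages = [44, 88, 99]
# Hypothetical (person, message) pairs: one like each for 44 and 88.
likes = [("p1", 44), ("p2", 88)]

def expected_likes(messages, likes):
    """Model MATCH + OPTIONAL MATCH + count(like), grouped by message id."""
    rows = []
    for m in messages:
        matched = [person for (person, target) in likes if target == m]
        # OPTIONAL MATCH keeps the message with a null `like` when nothing matches.
        optional_rows = matched if matched else [None]
        # count(like) skips nulls, so an unmatched message still yields a row with 0.
        rows.append((m, sum(1 for like in optional_rows if like is not None)))
    return rows

print(expected_likes(messages, likes))
```

Under this reading, message 99 must appear with 0 likes, which is the row missing from the ingraph output.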
The problem can be pinpointed to the following condition in AggregationNodes:
if (oldValues != newValues)
Of course, this line is there for a reason...
Currently, this breaks BI Q21.
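One possible way the `oldValues != newValues` check could drop zero-count groups is sketched below. This is a toy model under stated assumptions, not ingraph's actual AggregationNode: the class `ToyCountNode`, its `suppress_unchanged` flag and the `emitted` dictionary are all hypothetical, and it assumes that a group seen for the first time is compared against a default count of 0.

```python
class ToyCountNode:
    """Toy incremental count(value) grouped by key (hypothetical, not ingraph code)."""

    def __init__(self, suppress_unchanged):
        self.suppress_unchanged = suppress_unchanged
        self.counts = {}   # group key -> current count
        self.emitted = {}  # group key -> count visible downstream

    def insert(self, key, value):
        # Assumption: a group seen for the first time starts from count 0.
        old = self.counts.get(key, 0)
        # count() ignores nulls, modelled here as None.
        new = old + (0 if value is None else 1)
        self.counts[key] = new
        if self.suppress_unchanged and old == new:
            # Analogous to `if (oldValues != newValues)` being false:
            # nothing is propagated, so a brand-new group with count 0 never appears.
            return
        self.emitted[key] = new

# Rows as produced by the OPTIONAL MATCH: message 99 arrives with a null like.
rows = [(44, "like1"), (88, "like2"), (99, None)]

with_check = ToyCountNode(suppress_unchanged=True)
without_check = ToyCountNode(suppress_unchanged=False)
for key, value in rows:
    with_check.insert(key, value)
    without_check.insert(key, value)

print(with_check.emitted, without_check.emitted)
```

In this model the check hides message 99 because its count never differs from the default, while removing the check emits it with 0. Simply dropping the check would also propagate every no-op update, so a more targeted fix might track whether a group has already been emitted; that is a design suggestion, not something confirmed by the source.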