otter-sql
How should aggregate work in the VM
I've been reading through some of the existing code, but I still want to start a discussion.
https://github.com/SeaQL/sql-assembly/blob/e38f564785a6ba8a3542d512690ef90a059274d4/src/ic.rs#L373-L427
- I think aggregate should be like project: it has a source table, a destination table, and some expressions to evaluate. Only the evaluation model is different.
- In essence, here is how 'group by' works conceptually: for each row, construct a tuple from that row's values in the group by columns (there may be several). Use that tuple as the key in a 'hash map' (it should be our own table + index implementation), where we run the 'reduce' function for each collision.
- We might be missing an `Aggregate` instruction here, along with the expressions to evaluate.
- The `having` construct in SQL is way too powerful. For example, we can do `HAVING MAX(col3) + 1 > 10`, where we have to recognize that `MAX(col3)` is an expression already evaluated, and we still have to evaluate the `+ 1` part. It might be that our eager execution model does not align too well with SQL. Anyway, we can leave this problem for later. Maybe for now we only allow a binary operator to be used as a having clause, and the left operand must match one of the aggregate expressions.
Or is there some simpler way to implement this?
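
To make the 'group by' description above concrete, here is a rough sketch of the idea, not a proposal for the actual instruction set. All names (`Value`, `Row`, `Reduce`, `aggregate`, `having`) are hypothetical and do not exist in the codebase, values are restricted to `i64`, and a std `HashMap` stands in for our own table + index implementation. The `having` helper assumes one possible reading of the problem: once the aggregates are written to the destination table, the rest of the having expression can be evaluated as an ordinary filter over those columns.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Simplification for this sketch: every cell is an integer.
type Value = i64;
type Row = Vec<Value>;

/// Hypothetical reduce functions, one per aggregate expression.
#[derive(Clone, Copy)]
enum Reduce {
    Max,
    Sum,
    Count,
}

impl Reduce {
    /// Accumulator for the first row of a new group.
    fn init(self, v: Value) -> Value {
        match self {
            Reduce::Max | Reduce::Sum => v,
            Reduce::Count => 1,
        }
    }

    /// Fold one more row into an existing group's accumulator.
    fn step(self, acc: Value, v: Value) -> Value {
        match self {
            Reduce::Max => acc.max(v),
            Reduce::Sum => acc + v,
            Reduce::Count => acc + 1,
        }
    }
}

/// Group `source` by the columns in `group_by` and evaluate each
/// `(reduce, column)` pair per group. Each output row holds the key
/// columns followed by the aggregate results.
fn aggregate(source: &[Row], group_by: &[usize], aggs: &[(Reduce, usize)]) -> Vec<Row> {
    let mut groups: HashMap<Vec<Value>, Vec<Value>> = HashMap::new();
    let mut order: Vec<Vec<Value>> = Vec::new(); // first-seen order of keys

    for row in source {
        // The tuple of group-by values is the hash map key.
        let key: Vec<Value> = group_by.iter().map(|&c| row[c]).collect();
        match groups.entry(key) {
            // Key collision: run the reduce step for every aggregate.
            Entry::Occupied(mut e) => {
                for (acc, &(f, c)) in e.get_mut().iter_mut().zip(aggs) {
                    *acc = f.step(*acc, row[c]);
                }
            }
            // New group: initialise one accumulator per aggregate.
            Entry::Vacant(e) => {
                let accs: Vec<Value> = aggs.iter().map(|&(f, c)| f.init(row[c])).collect();
                order.push(e.key().clone());
                e.insert(accs);
            }
        }
    }

    order
        .into_iter()
        .map(|key| {
            let accs = groups.remove(&key).unwrap();
            key.into_iter().chain(accs).collect()
        })
        .collect()
}

/// HAVING as a plain filter over the aggregated rows: the aggregate is
/// already a column, so only the remaining expression is evaluated.
fn having(rows: Vec<Row>, pred: impl Fn(&Row) -> bool) -> Vec<Row> {
    rows.into_iter().filter(|r| pred(r)).collect()
}
```

For example, grouping the rows `[1, 5]`, `[2, 20]`, `[1, 9]`, `[2, 3]` on column 0 with `Reduce::Max` over column 1 gives `[1, 9]` and `[2, 20]`. Something like `HAVING MAX(col3) + 1 > 10` then becomes `having(rows, |r| r[1] + 1 > 10)`, which keeps only the second group. The hard part this sketch skips is recognizing which sub-expression of the having clause maps to which output column.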