otter-sql How should aggregate work in the VM

Open tyt2y3 opened this issue 3 years ago • 0 comments

I have been reading through some of the existing code and would like to start a discussion.

https://github.com/SeaQL/sql-assembly/blob/e38f564785a6ba8a3542d512690ef90a059274d4/src/ic.rs#L373-L427

I think aggregate should be like project: it has a source table, a destination table, and some expressions to evaluate. Only the evaluation model is different.
In essence, this is how 'group by' works conceptually: for each row, construct a tuple from that row's values in the group by columns (there may be more than one). Use that tuple as the key in a 'hash map' (it should be our own table + index implementation), and run the 'reduce' function whenever a row lands on a key that already exists.
We might be missing an Aggregate instruction here, along with the expressions to evaluate.
The HAVING construct in SQL is way too powerful. For example, we can write HAVING MAX(col3) + 1 > 10, where we have to recognize that MAX(col3) is an expression that has already been evaluated, and we still have to evaluate the + 1 part. It might be that our eager execution model does not align too well with SQL. Anyway, we can leave this problem for later. Maybe for now we only allow a single binary operator as the HAVING clause, and its left operand must match one of the aggregate expressions.
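To make the idea above concrete, here is a minimal sketch in Rust of the grouping model together with the restricted HAVING form. Everything in it is an assumption for illustration: `Value`, `Row`, `AggFn`, `Agg`, `Having` and `aggregate` are hypothetical names that do not exist in the VM, a std `HashMap` stands in for our own table + index implementation, and the only HAVING operator modelled is `>` against a constant.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Hypothetical value and row types; the real VM has its own.
type Value = i64;
type Row = Vec<Value>;

/// Aggregate functions supported by this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
enum AggFn { Max, Sum, Count }

/// One aggregate expression; MAX(col3) would be `Agg { func: AggFn::Max, col: 2 }`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Agg { func: AggFn, col: usize }

impl Agg {
    /// Accumulator value for the first row seen in a group.
    fn init(&self, row: &Row) -> Value {
        match self.func {
            AggFn::Count => 1,
            AggFn::Max | AggFn::Sum => row[self.col],
        }
    }

    /// The 'reduce' step, run when a row lands on an existing key.
    fn reduce(&self, acc: Value, row: &Row) -> Value {
        match self.func {
            AggFn::Max => acc.max(row[self.col]),
            AggFn::Sum => acc + row[self.col],
            AggFn::Count => acc + 1,
        }
    }
}

/// Restricted HAVING: `<aggregate> > <constant>` and nothing else.
struct Having { left: Agg, greater_than: Value }

/// Groups `source` by `group_cols` and evaluates `aggs` for each group.
/// Each output row is the group key columns followed by the aggregate values.
fn aggregate(
    source: &[Row],
    group_cols: &[usize],
    aggs: &[Agg],
    having: Option<&Having>,
) -> Vec<Row> {
    let mut groups: HashMap<Vec<Value>, Vec<Value>> = HashMap::new();
    for row in source {
        // The tuple of group by values is the hash map key.
        let key: Vec<Value> = group_cols.iter().map(|&c| row[c]).collect();
        match groups.entry(key) {
            Entry::Occupied(mut e) => {
                for (acc, agg) in e.get_mut().iter_mut().zip(aggs) {
                    *acc = agg.reduce(*acc, row);
                }
            }
            Entry::Vacant(e) => {
                e.insert(aggs.iter().map(|a| a.init(row)).collect());
            }
        }
    }
    let mut out: Vec<Row> = groups
        .into_iter()
        .filter(|(_, accs)| match having {
            None => true,
            // The left operand must be one of the evaluated aggregates;
            // this sketch drops every group if it is not.
            Some(h) => match aggs.iter().position(|a| *a == h.left) {
                Some(i) => accs[i] > h.greater_than,
                None => false,
            },
        })
        .map(|(mut key, accs)| {
            key.extend(accs);
            key
        })
        .collect();
    out.sort(); // HashMap iteration order is unspecified, so sort for stable output
    out
}
```

Because the HAVING operand is looked up among the aggregates that were already computed, nothing is re-evaluated per group. The cost is that HAVING MAX(col3) + 1 > 10 cannot be expressed directly and would have to be written as MAX(col3) > 9.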

Or is there some simpler way to implement this?

Aug 06 '22 09:08 tyt2y3