prql
prql copied to clipboard
Some aggregation functions need ORDER BY clause
let json_agg = func arr -> s"json_agg({arr})"
from invoices
sort total
aggregate (json_agg f"{month}: {total}")
(intention: I want a single row that is a json array, which contains "month: total" entries, sorted by total)
I think this PRQL should produce the following SQL:
SELECT
json_agg(month || ': ' || total ORDER BY total)
FROM invoices
The important bit is that json_agg
has its column ORDERed BY total.
This is because from invoices | sort total
produces a relation, ordered by total, so this ordering should be accounted by the aggregation function.
Yes agree.
What's the class of functions that require the ORDER BY
? There's WINDOW
, though possibly that's a separate issue entirely.
How would we encode this? We have a field on each function that specifies "include an ORDER BY
clause within the function"?
In PRQL, columns are arrays, not sets as they are in SQL. So columns do have an order they are in, which means that by default all aggregation function should take ordering into account.
So I think we could encode that in std.sql.prql
with an annotation:
@{aggregation_unordered = true}
let sum = func col -> s"..."
@{aggregation_unordered = false}
let json_agg = func col -> s"..."