feat: add support for aggregate ordering where sensible
Postgres and other systems support an ORDER BY statement in various operations where ordering is meaningful, e.g., string_agg/array_agg. We should support this in ibis as well.
One question is whether this should be included in all aggregates, including Sum, Mean etc.
This could be useful, but let's wait until someone asks for it.
This could be useful, but let's wait until someone asks for it.
Could we please have this? 😛
I'm wanting to do _.column.collect(order_by=_.timestamp)[-1] but can't.
@cpcloud Can we reopen this?
I don't think there is currently a way to express SQL like:
SELECT ARRAY_AGG(value ORDER BY timestamp DESC LIMIT 1) AS latest_value
FROM data
GROUP BY user
Yep!
@judahrand In the mean time if you're blocked, you can always use .sql for something that doesn't (yet) have an ibis equivalent.
I think for the DESC LIMIT 1 syntax we won't bake that into collect() but will allow this syntax:
ibis.array(t.select("value").order_by(ibis.desc("timestamp")).limit(1).subq()) # subq doesn't exist
Otherwise aggregate functions end up with a bunch of mostly unused arguments that emulate a SELECT statement.