ibis icon indicating copy to clipboard operation
ibis copied to clipboard

feat: add support for aggregate ordering where sensible

Open cpcloud opened this issue 4 years ago • 1 comments

Postgres and other systems support an ORDER BY statement in various operations where ordering is meaningful, e.g., string_agg/array_agg. We should support this in ibis as well.

cpcloud avatar Sep 02 '21 22:09 cpcloud

One question is whether this should be included in all aggregates, including Sum, Mean etc.

cpcloud avatar Jan 18 '22 21:01 cpcloud

This could be useful, but let's wait until someone asks for it.

cpcloud avatar Apr 02 '23 10:04 cpcloud

This could be useful, but let's wait until someone asks for it.

Could we please have this? 😛

I'm wanting to do _.column.collect(order_by=_.timestamp)[-1] but can't.

judahrand avatar Nov 07 '23 19:11 judahrand

@cpcloud Can we reopen this?

I don't think there is currently a way to express SQL like:

SELECT ARRAY_AGG(value ORDER BY timestamp DESC LIMIT 1) AS latest_value
FROM data
GROUP BY user

judahrand avatar Nov 09 '23 16:11 judahrand

Yep!

cpcloud avatar Nov 09 '23 16:11 cpcloud

@judahrand In the mean time if you're blocked, you can always use .sql for something that doesn't (yet) have an ibis equivalent.

cpcloud avatar Nov 09 '23 18:11 cpcloud

I think for the DESC LIMIT 1 syntax we won't bake that into collect() but will allow this syntax:

ibis.array(t.select("value").order_by(ibis.desc("timestamp")).limit(1).subq())  # subq doesn't exist

Otherwise aggregate functions end up with a bunch of mostly unused arguments that emulate a SELECT statement.

cpcloud avatar Nov 09 '23 18:11 cpcloud