sea-query
sea-query copied to clipboard
Support window functions like `LEAD(...) OVER (...)` and `LAG(...) OVER (...)` (and others)
Related to #344
Ref:
- https://mariadb.com/kb/en/window-functions/
- https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html
- https://www.postgresql.org/docs/15/tutorial-window.html
Motivation
I am aggregating data and need to compare data to fields of next and previous rows, in order to:
- Determine if a previous row is present
- Calculate a percentage change between the current and next row.
My use case is this (MariaDB):
Table (just to give you context for the query further below - I am not currently looking to build this CREATE TABLE
statement in sea-query):
CREATE TABLE tx_digibordauthentication_activity
(
uid bigint unsigned NOT NULL auto_increment,
date_time datetime NOT NULL,
person int(11) unsigned NOT NULL DEFAULT '0',
institution int(11) unsigned NOT NULL DEFAULT '0',
subscription int(11) unsigned NOT NULL DEFAULT '0',
year_month varchar(7) AS (DATE_FORMAT(CONVERT_TZ(UTC_TIMESTAMP, "UTC", "Europe/Amsterdam"), '%Y-%m')) PERSISTENT,
year_week varchar(7) AS (DATE_FORMAT(CONVERT_TZ(UTC_TIMESTAMP, "UTC", "Europe/Amsterdam"), '%x-W%v')) PERSISTENT,
year_month_day varchar(10) AS (DATE_FORMAT(CONVERT_TZ(UTC_TIMESTAMP, "UTC", "Europe/Amsterdam"), '%Y-%m-%d')) PERSISTENT,
PRIMARY KEY (uid),
KEY institution_last_activity (institution, date_time),
KEY institution_monthly (institution, year_month, person),
KEY institution_weekly (institution, year_week, person),
KEY institution_daily (institution, year_month_day, person),
KEY subscription_last_activity (subscription, date_time),
KEY subscription_monthly (subscription, year_month, person),
KEY subscription_weekly (subscription, year_week, person),
KEY subscription_daily (subscription, year_month_day, person),
);
Query - here sea-query does not support all I need - i.e. LEAD(...) OVER (ORDER BY ...)
-, and some stuff gets very tricky even with Expr::cust_with_exprs()
etc.:
SELECT p.`year_month`,
COUNT(a.uid) AS all_activity,
IF(
LAG(p.`year_month`) OVER (ORDER BY p.`year_month` DESC) IS NULL,
NULL,
(COUNT(a.uid) / (LEAD(COUNT(a.uid)) OVER (ORDER BY p.`year_month` DESC)) - 1) * 100
) AS all_activity_change_percent,
COUNT(DISTINCT a.year_month_day) AS active_days,
IF(
LAG(p.`year_month`) OVER (ORDER BY p.`year_month` DESC) IS NULL,
NULL,
(COUNT(DISTINCT a.year_month_day) / (LEAD(COUNT(DISTINCT a.year_month_day)) OVER (ORDER BY p.`year_month` DESC)) - 1) * 100
) AS active_days_change_percent,
COUNT(DISTINCT a.person) AS active_personnel,
IF(
LAG(p.`year_month`) OVER (ORDER BY p.`year_month` DESC) IS NULL,
NULL,
(COUNT(DISTINCT a.person) / (LEAD(COUNT(DISTINCT a.person)) OVER (ORDER BY p.`year_month` DESC)) - 1) * 100
) AS active_personnel_change_percent
FROM (
SELECT DISTINCT `year_month`
FROM tx_digibordauthentication_activity
ORDER BY `year_month` DESC
) p
LEFT JOIN tx_digibordauthentication_activity a
ON a.year_month = p.year_month
AND a.institution = 2050
GROUP BY p.`year_month`
ORDER BY p.`year_month` DESC;
Proposed Solutions
Add Expr::lead_over()
, Expr::lag_over()
and other window functions. Resulting, if possible, in Expr
, as SimpleExpr
is quite unergonomic.
Maybe there should be a trait for aggregate functions, which allow to call .window()
on it. Not sure how to solve this the most ergonomic and idiomatic?
As a workaround, I tried to build the window functions as Expr::cust_with_exprs()
with a Func::cust()
for the LAG(...)
and a WindowStatement::order_by
for the OVER (...)
.
But all that won't work (yet?), as WindowStatement
has no Into<SimpleExpr>
, so it can't be used as exprs
iterable for cust_with_exprs
(and is generally made only for WindowSelectType
/ SelectStatement::window*
).
Additional Information
I am missing a few things here and I will open separate issues for these:
-
COUNT(DISTINCT <table-ref>)
#600 -
IF(<cond>, <then-expr>, <else-expr>)
#602
+1