sea-query icon indicating copy to clipboard operation
sea-query copied to clipboard

Support window functions like `LEAD(...) OVER (...)` and `LAG(...) OVER (...)` (and others)

Open LeoniePhiline opened this issue 2 years ago • 1 comments

Related to #344

Ref:

  • https://mariadb.com/kb/en/window-functions/
  • https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html
  • https://www.postgresql.org/docs/15/tutorial-window.html

Motivation

I am aggregating data and need to compare data to fields of next and previous rows, in order to:

  • Determine if a previous row is present
  • Calculate a percentage change between the current and next row.

My use case is this (MariaDB):

Table (just to give you context for the query further below - I am not currently looking to build this CREATE TABLE statement in sea-query):

CREATE TABLE tx_digibordauthentication_activity
(
  uid          bigint unsigned    NOT NULL auto_increment,

  date_time    datetime           NOT NULL,

  person       int(11) unsigned   NOT NULL DEFAULT '0',
  institution  int(11) unsigned   NOT NULL DEFAULT '0',
  subscription int(11) unsigned   NOT NULL DEFAULT '0',

  year_month      varchar(7) AS (DATE_FORMAT(CONVERT_TZ(UTC_TIMESTAMP, "UTC", "Europe/Amsterdam"), '%Y-%m')) PERSISTENT,
  year_week       varchar(7) AS (DATE_FORMAT(CONVERT_TZ(UTC_TIMESTAMP, "UTC", "Europe/Amsterdam"), '%x-W%v')) PERSISTENT,
  year_month_day  varchar(10) AS (DATE_FORMAT(CONVERT_TZ(UTC_TIMESTAMP, "UTC", "Europe/Amsterdam"), '%Y-%m-%d')) PERSISTENT,

  PRIMARY KEY (uid),

  KEY institution_last_activity (institution, date_time),
  KEY institution_monthly (institution, year_month, person),
  KEY institution_weekly (institution, year_week, person),
  KEY institution_daily (institution, year_month_day, person),

  KEY subscription_last_activity (subscription, date_time),
  KEY subscription_monthly (subscription, year_month, person),
  KEY subscription_weekly (subscription, year_week, person),
  KEY subscription_daily (subscription, year_month_day, person),
);

Query - here sea-query does not support all I need - i.e. LEAD(...) OVER (ORDER BY ...) -, and some stuff gets very tricky even with Expr::cust_with_exprs() etc.:

SELECT p.`year_month`,
  COUNT(a.uid) AS all_activity,
  IF(
    LAG(p.`year_month`) OVER (ORDER BY p.`year_month` DESC) IS NULL,
    NULL,
    (COUNT(a.uid) / (LEAD(COUNT(a.uid)) OVER (ORDER BY p.`year_month` DESC)) - 1) * 100
  ) AS all_activity_change_percent,
  COUNT(DISTINCT a.year_month_day) AS active_days,
  IF(
    LAG(p.`year_month`) OVER (ORDER BY p.`year_month` DESC) IS NULL,
    NULL,
    (COUNT(DISTINCT a.year_month_day) / (LEAD(COUNT(DISTINCT a.year_month_day)) OVER (ORDER BY p.`year_month` DESC)) - 1) * 100
  ) AS active_days_change_percent,
  COUNT(DISTINCT a.person) AS active_personnel,
  IF(
    LAG(p.`year_month`) OVER (ORDER BY p.`year_month` DESC) IS NULL,
    NULL,
    (COUNT(DISTINCT a.person) / (LEAD(COUNT(DISTINCT a.person)) OVER (ORDER BY p.`year_month` DESC)) - 1) * 100
  ) AS active_personnel_change_percent
FROM (
  SELECT DISTINCT `year_month`
  FROM tx_digibordauthentication_activity
  ORDER BY `year_month` DESC
) p
LEFT JOIN tx_digibordauthentication_activity a
  ON a.year_month = p.year_month
    AND a.institution = 2050
GROUP BY p.`year_month`
ORDER BY p.`year_month` DESC;

Proposed Solutions

Add Expr::lead_over(), Expr::lag_over() and other window functions. Resulting, if possible, in Expr, as SimpleExpr is quite unergonomic.

Maybe there should be a trait for aggregate functions, which allow to call .window() on it. Not sure how to solve this the most ergonomic and idiomatic?

As a workaround, I tried to build the window functions as Expr::cust_with_exprs() with a Func::cust() for the LAG(...) and a WindowStatement::order_by for the OVER (...). But all that won't work (yet?), as WindowStatement has no Into<SimpleExpr>, so it can't be used as exprs iterable for cust_with_exprs (and is generally made only for WindowSelectType / SelectStatement::window*).

Additional Information

I am missing a few things here and I will open separate issues for these:

  • COUNT(DISTINCT <table-ref>) #600
  • IF(<cond>, <then-expr>, <else-expr>) #602

LeoniePhiline avatar Feb 09 '23 20:02 LeoniePhiline

+1

ilxqx avatar Nov 18 '23 06:11 ilxqx