malloy
malloy copied to clipboard
Window Functions
Add support for window functions. A few we'd probably like to support:
- LAG / LEAD : "how long did it take for User to get from step A to step B?" "How long are people typically idle after they do x action?" "What was the previous status of this item?"
- FIRST_VALUE / LAST_VALUE / NTH_VALUE - important for event/session data. "what was the first action User took? What was the first/last action or landing/exit page in User's session?"; grabbing the "first" ever referrer for a user across many sessions, filling in each row of a historical state table from EAV data. etc
- RANK / ROW_NUMBER - often for sequencing things like transactions "What percent of purchases come from first-time vs newish vs seasoned customers?" or just for generating a primary key where there isn't one, or even for funnel analysis / finding given value in a partition
- Percentile functions (including MEDIAN), helpful for visualizing the data and seeing the distribution
We had a request from someone in Slack wanting to do a rolling average today. Throwing in an approach that works for now in case others are searching for this:
query: rolling_average_line_chart is table('malloy-data.ecomm.order_items'){
primary_key: id
}
-> {
nest: orders_1 is {
group_by: order_date is created_at.day
aggregate: total_sales is sale_price.sum()
}
nest: orders_2 is {
group_by: order_date is created_at.day
aggregate: total_sales is sale_price.sum()
}
}
-> {
where:
orders_2.order_date < orders_1.order_date,
orders_2.order_date >= orders_1.order_date - 7 days
project: order_date is orders_1.order_date
project: total_sales_2 is orders_2.total_sales
}
-> {
group_by: order_date
aggregate: rolling_average is total_sales_2.sum() / 7
}
but this will be muuuuch nicer when we can support window functions and/or if we provide a smooth way to invoke a rolling average.
Documenting another example; this gets you FIRST_VALUE
/ LAST_VALUE
. To figure out the first product each user ever ordered:
query: user_first_product is order_items -> {
group_by: user_id
nest: first_product is {
group_by:
product_name is inventory_items.product_name
created_at
order_by: created_at asc
limit: 1
}
} -> {
project: user_id, first_product.product_name
}
is this feature (enabling window functions) being worked upon? 🤔
Yes. It lands in pre-release tomorrow.
On Mon, Jun 12, 2023, 1:10 PM Yadunandan Batchu @.***> wrote:
is this feature (enabling window functions) being worked upon? 🤔
— Reply to this email directly, view it on GitHub https://github.com/malloydata/malloy/issues/46#issuecomment-1588018991, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIK6UUZEVM3OLYUBOTC5JTXK5ZRVANCNFSM5E6QZJDA . You are receiving this because you were assigned.Message ID: @.***>
Is supporting PARTITION BY
within the calculate analytic/window function on the round map?