velox icon indicating copy to clipboard operation
velox copied to clipboard

Implement window functions in Velox

Open aditi-pandit opened this issue 3 years ago • 8 comments
trafficstars

Window functions https://prestodb.io/docs/current/functions/window.html#functions-window are a special category of analytic functions in SQL.

These are heavily used in TPC-DS benchmark and also critical to other client SQL workloads.

This issue tracks their implementation in Velox.

Efficient Processing of Window Functions in Analytical SQL Queries VLDB paper: http://www.vldb.org/pvldb/vol8/p1058-leis.pdf

Window functions in DuckDB: https://duckdb.org/docs/sql/window_functions

aditi-pandit avatar Dec 01 '21 18:12 aditi-pandit

CC: @beroyfb @pedroerp

mbasmanova avatar Dec 01 '21 18:12 mbasmanova

PR #1246 PR #1692 PR #1691

mbasmanova avatar May 24 '22 22:05 mbasmanova

Doc https://docs.google.com/document/d/1YiBv7tuWK8lqXTXcgyTWkLYFEA1Rama8wrifdL5nNFw/edit has a description of the functional requirements for Window functions and high-level design thinking.

There are many TODO's in the doc that are being actively worked on.

aditi-pandit avatar May 26 '22 05:05 aditi-pandit

PR #1695

aditi-pandit avatar May 26 '22 05:05 aditi-pandit

Per offline discussion with @aditi-pandit , the plan is:

  • Add support for row_number, rank and nth_value functions. This requires changes to the WindowFunction interface to pass channel indices of the inputs and calculation of peers in the operator.
  • Add integration with Presto.

This will give us first milestone with 3 functions working end to end with default frame.

Next steps would be (1) cut up PRs and get these reviewed and landed (2) do performance evaluation and identify bottlenecks (3) optimize based on results from (2) (4) implement frames (5) add support for any aggregate function to be used in a window operator (6) add more window functions.

Items 3, 4, 5 can be worked on in parallel. Item 6 has a dependency on 4.

mbasmanova avatar May 26 '22 17:05 mbasmanova

PR https://github.com/facebookincubator/velox/pull/1816

aditi-pandit avatar Jun 15 '22 17:06 aditi-pandit

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale[bot] avatar Sep 14 '22 20:09 stale[bot]

Assuming this is still in progress - re-opening it.

pedroerp avatar Oct 07 '22 03:10 pedroerp

Functional work pending: i) Support for k preceding/k following rows/range frames. ii) Functions first_value, last_value iii) Functions lead, lag

Performance optimizations WIP Spill to disk WIP

aditi-pandit avatar Nov 18 '22 22:11 aditi-pandit

Hello @aditi-pandit , I saw is ignoreNulls defined https://github.com/facebookincubator/velox/blob/d0e1acfddfb1918957a0aceb215dbc01acecd82d/velox/core/PlanNode.h#L1876 but not used anywhere. Do you have any plan to support "IGNORE NULLS" for value functions like nth_value etc? Thanks.

Yes, this needs to be handled.

zhli1142015 avatar May 08 '23 07:05 zhli1142015

@aditi-pandit Aditi, is anyone working on adding lead and lag functions? If not, I'm going to add these.

mbasmanova avatar May 30 '23 20:05 mbasmanova

Hi @mbasmanova,

There are some PRs in progress for lead/lag https://github.com/facebookincubator/velox/pull/2942. We are planning to prepare for submission.

aditi-pandit avatar May 30 '23 22:05 aditi-pandit

Hello @aditi-pandit , I saw is ignoreNulls defined

https://github.com/facebookincubator/velox/blob/d0e1acfddfb1918957a0aceb215dbc01acecd82d/velox/core/PlanNode.h#L1876

but not used anywhere. Do you have any plan to support "IGNORE NULLS" for value functions like nth_value etc? Thanks. Yes, this needs to be handled.

Hi @zhli1142015, Thanks for pointing that out. Yes, they need to be handled. There are plans to support them.

aditi-pandit avatar May 30 '23 22:05 aditi-pandit

@aditi-pandit

There are some PRs in progress for lead/lag https://github.com/facebookincubator/velox/pull/2942.

Thank you for the pointer. The PR is very old though and doesn't seem to be actively worked on. What is the ETA for it to be ready for review?

mbasmanova avatar May 30 '23 22:05 mbasmanova

@mbasmanova : Pramod is back from his vacation end of this week. He will pick it up after that.

aditi-pandit avatar May 30 '23 22:05 aditi-pandit

Closing this issue since https://github.com/facebookincubator/velox/issues/5411 is the only functionality of Presto window not available in Velox now.

aditi-pandit avatar Jun 06 '24 23:06 aditi-pandit