sqlflow icon indicating copy to clipboard operation
sqlflow copied to clipboard

[Discussion] SQL (or extended) statement to convert the data of sliding windows to rows.

Open brightcoder01 opened this issue 5 years ago • 2 comments

Requirement

There are some scenarios of time series forecasting in real world, such as forecasting the pv/uv or the traffic load from the history data. The sample data is as follows:

app_id time uv pv
1 1 1 5
1 2 2 6
1 3 3 7
1 4 4 8
2 1 11 12
2 2 13 14
2 3 15 16
2 4 17 18

The table above contains the data of two entities/app_id 1 and 2. Each entity has the uv/pv in 4 time steps 1, 2, 3, 4.

The common data transform function for time series data is convert the data of sliding windows to rows. If the window size is 2, the result table is:

app_id time uv uv_1 uv_2 pv pv_1 pv_2
1 1 1 NULL NULL 5 NULL NULL
1 2 2 1 NULL 6 5 NULL
1 3 3 2 1 7 6 5
1 4 4 3 2 8 7 6
2 1 11 NULL NULL 12 NULL NULL
2 2 13 11 NULL 14 12 NULL
2 3 15 13 11 16 14 12
2 4 17 15 13 18 16 14

Points:

  1. Do the existed SQL syntax (MaxCompute / Hive / MySQL) support this transformation? If yes, how complex is it?
  2. Do we need extend the SQL syntax in SQLFlow (such as TO RUN clause) ?

brightcoder01 avatar May 13 '20 23:05 brightcoder01

MaxCompute SQL:

SELECT 
    app_id, 
    time,
    uv,
    LAG(uv, 1) OVER (PARTITION BY app_id ORDER BY time) AS uv_1,
    LAG(uv, 2) OVER (PARTITION BY app_id ORDER BY time) AS uv_2,
    pv,
    LAG(pv, 1) OVER (PARTITION BY app_id ORDER BY time) AS pv_1,
    LAG(pv, 2) OVER (PARTITION BY app_id ORDER BY time) AS pv_2
FROM ts_data
ORDER BY app_id, time;

PS:If the windows size is larger (10 or more), the SQL statement will be long.

brightcoder01 avatar May 14 '20 01:05 brightcoder01

We will add TO RUN statement in SQLFlow. #2355 We can wrap the logic of convert the data of sliding windows to rows into python program and build it into a docker image. SQLFlow can invoke this functionality using TO RUN statement. In this way, user don't need write a long SQL to complete this task.

brightcoder01 avatar Jun 01 '20 13:06 brightcoder01