mindsdb [CUSTOMER REQUEST] Enforce specific time granularity on order

[CUSTOMER REQUEST] Enforce specific time granularity on order_by column

Open paxcema opened this issue 2 years ago • 2 comments

Checklist:

[ ] 1) [MindsDB] should always convert order_by columns to pd.datetime
[ ] 2) [mindsdb_sql] Add SAMPLE BY support to enforce specific time intervals
[ ] 3) [mindsdb_sql] Robust support for aggregation when selecting columns (e.g. AVG(prices), LATEST(ID)) for deduping.
[ ] 4) [MindsDB] Create views using the above.

Example

Create a view that samples on five minute intervals, aggregating the other columns with AVG and LAST:

CREATE VIEW mytrainingdata FROM (
   SELECT time, AVG(col1), LAST(col2) FROM (
         (SELECT * FROM shopify_integration.orders)
    )
    SAMPLE BY 5m
)

Train a predictor using this view:

CREATE PREDICTOR mindsdb.shopify_model
FROM views.mytrainingdata
PREDICT col1
ORDER BY time
GROUP BY col2
WINDOW 10
HORIZON 5;

Note

Keeping the Lightwood issues open for now, as we're still not 100% decided on feature design and scope.

Jul 29 '22 17:07 paxcema

Meeting notes

Size could be an issue
LAST instead of LATEST
Could be a problem
Some way to handle grouping, e.g. GROUPBY(col2)
Storing views: not necessary? We just store and re-run the query. Could be slow at prediction time (would have to recreate view). Possibly store in a central postgres database, same place as predictors etc
Meeting Mon to discuss further

Aug 12 '22 14:08 tomhuds

Was discussing this one with @Ricram2 just now. An update:

We are prioritizing doing this natively in the data source whenever possible. If not, we will have some subset of the above implemented as a function in the SQL feature engineering project.

Nov 04 '22 17:11 paxcema

Closing. All originally mentioned fixes have either been implemented or are part of the roadmap (FESQL project).

Dec 16 '22 18:12 paxcema

mindsdb mindsdb copied to clipboard

[CUSTOMER REQUEST] Enforce specific time granularity on order_by column

Checklist:

Example

Note

mindsdb
mindsdb copied to clipboard