ibis icon indicating copy to clipboard operation
ibis copied to clipboard

feat(bigquery): add `bucket`

Open dixdotdev opened this issue 1 year ago • 0 comments

Is your feature request related to a problem?

So I've become kind of a big fan of the time_bucket function in Ibis which works great in DuckDB, but isn't in BigQuery so I've written a UDF for BigQuery which achieves what I regard as a pretty similar function.

I'm not much of an OSS contributor due to time constraints, and I honestly can't see where your UDFs for BigQuery are so I'm kind of just throwing this out there in the hopes that one of you lovely people has the time to add it, or modify it, or rewrite it, and an appropriate test to Ibis :grinning: sorry?

CREATE TEMP FUNCTION time_bucket(bucket_width STRING, ts TIMESTAMP, offset STRING)
RETURNS TIMESTAMP AS (
  TIMESTAMP_SECONDS(
    (UNIX_SECONDS(ts) -
    (CASE
      WHEN ENDS_WITH(offset, ' minute') THEN CAST(SPLIT(offset, ' ')[OFFSET(0)] AS INT64) * 60
      WHEN ENDS_WITH(offset, ' hour') THEN CAST(SPLIT(offset, ' ')[OFFSET(0)] AS INT64) * 3600
      WHEN ENDS_WITH(offset, ' day') THEN CAST(SPLIT(offset, ' ')[OFFSET(0)] AS INT64) * 86400
      ELSE CAST(SPLIT(offset, ' ')[OFFSET(0)] AS INT64)
    END))
    -
    MOD(
      (UNIX_SECONDS(ts) -
      (CASE
        WHEN ENDS_WITH(offset, ' minute') THEN CAST(SPLIT(offset, ' ')[OFFSET(0)] AS INT64) * 60
        WHEN ENDS_WITH(offset, ' hour') THEN CAST(SPLIT(offset, ' ')[OFFSET(0)] AS INT64) * 3600
        WHEN ENDS_WITH(offset, ' day') THEN CAST(SPLIT(offset, ' ')[OFFSET(0)] AS INT64) * 86400
        ELSE CAST(SPLIT(offset, ' ')[OFFSET(0)] AS INT64)
      END)),
      (CASE
        WHEN ENDS_WITH(bucket_width, ' minute') THEN CAST(SPLIT(bucket_width, ' ')[OFFSET(0)] AS INT64) * 60
        WHEN ENDS_WITH(bucket_width, ' hour') THEN CAST(SPLIT(bucket_width, ' ')[OFFSET(0)] AS INT64) * 3600
        WHEN ENDS_WITH(bucket_width, ' day') THEN CAST(SPLIT(bucket_width, ' ')[OFFSET(0)] AS INT64) * 86400
        ELSE CAST(SPLIT(bucket_width, ' ')[OFFSET(0)] AS INT64)
      END)
    )
  )
);

Usage:

SELECT time_bucket('15 minute', TIMESTAMP "2024-04-11 14:41:00", '0 minute') as bucket_ts;

Returns: 2024-04-11 14:30:00

What is the motivation behind your request?

Feature parity with the DuckDB backend

Describe the solution you'd like

As written above, an equivalent time_bucket function for the BigQuery backend.

What version of ibis are you running?

8.0.0

What backend(s) are you using, if any?

DuckDB and BigQuery

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

dixdotdev avatar Apr 12 '24 08:04 dixdotdev