bigrquery Update the bq_perform_query to allow user to pass in time partitioning and clustering

Open SeagleLiu opened this issue 4 years ago • 3 comments

Time partitioning and clustering are important properties of a BigQuery table: they reduce cost and improve performance. These two options will allow users to specify time partitioning and clustering when a new table is created from query results.

Aug 04 '20 17:08 SeagleLiu

Fixed the roxygen issue. All checks should now pass except for the mac dev one.

Aug 04 '20 18:08 SeagleLiu

@SeagleLiu,

Just note that you can already create an empty table with partitioning and clustering using bigrquery::bq_table_create():

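# Assumes `tbl` is a bq_table reference and `schema.file` is the path to a JSON schema file.
# `read_json` is assumed to be jsonlite::read_json.
# The BigQuery API's clustering object takes a `fields` array.
clustering <- list(fields = list("field1"))
day.partitioning <- list(type = "DAY")
bigrquery::bq_table_create(
  tbl,
  fields = jsonlite::read_json(schema.file),
  timePartitioning = day.partitioning,
  clustering = clustering
)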

Nov 20 '20 15:11 byapparov

@byapparov Thanks for pointing this out. This PR is meant to allow timePartitioning and clustering to be specified when the table is created from a query. In our use cases, we join and filter some big tables to create new ones. The original bq_perform_query did not pass on these two parameters.

Nov 23 '20 22:11 SeagleLiu
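
For readers with the same use case, one possible workaround that does not depend on this PR is to put the partitioning and clustering into the SQL itself with a BigQuery CREATE TABLE ... AS SELECT statement. The sketch below is only an illustration: build_ctas_sql() is a hypothetical helper (not part of bigrquery), and the dataset, table, and column names are made up. Submitting the statement is left commented out because it needs credentials and a billing project.

```r
# Hypothetical helper (not part of bigrquery): wraps a SELECT statement in a
# BigQuery standard SQL CREATE TABLE ... AS SELECT statement that declares
# partitioning and clustering for the new table.
build_ctas_sql <- function(table, select_sql, partition_expr, cluster_fields) {
  paste0(
    "CREATE TABLE `", table, "`\n",
    "PARTITION BY ", partition_expr, "\n",
    "CLUSTER BY ", paste(cluster_fields, collapse = ", "), "\n",
    "AS\n",
    select_sql
  )
}

sql <- build_ctas_sql(
  table = "my_dataset.events_by_day",   # assumed destination table
  select_sql = "SELECT * FROM my_dataset.events WHERE user_id IS NOT NULL",
  partition_expr = "DATE(event_ts)",    # assumes event_ts is a TIMESTAMP column
  cluster_fields = c("user_id")
)
cat(sql, "\n")

# Submitting the statement (requires authentication; the project id is assumed):
# job <- bigrquery::bq_perform_query(sql, billing = "my-billing-project")
# bigrquery::bq_job_wait(job)
```

Because the table options live in the SQL, no extra arguments to bq_perform_query are needed; the trade-off is that the partition expression and cluster columns are not validated on the R side.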

If you're still interested in this, would you mind filing an issue? I think the current implementation is a bit too minimal, and I'm worried that it'll make it too easy for users to shoot themselves in the foot.

Nov 07 '23 23:11 hadley
