
Update the bq_perform_query to allow user to pass in time partitioning and clustering

Open SeagleLiu opened this issue 4 years ago • 3 comments

Time partitioning and clustering are important properties of a BigQuery table: they reduce query cost and improve performance. These two options let users specify time partitioning and clustering when a new table is created from query results.

SeagleLiu avatar Aug 04 '20 17:08 SeagleLiu

Fixed the roxygen issue. All checks should now pass except for the mac dev one.

SeagleLiu avatar Aug 04 '20 18:08 SeagleLiu

@SeagleLiu,

Just a note: you can already create an empty partitioned table using bigrquery::bq_table_create():

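# `tbl` (a table reference) and `schema.file` (path to a JSON schema) are defined elsewhere.
# read_json() is assumed here to be jsonlite::read_json().
clustering <- list("field1")
day.partitioning <- list(type = "DAY")
bigrquery::bq_table_create(
  tbl,
  fields = jsonlite::read_json(schema.file),
  timePartitioning = day.partitioning,
  clustering = clustering
)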

byapparov avatar Nov 20 '20 15:11 byapparov

@byapparov Thanks for pointing this out. This PR is meant to allow timePartitioning and clustering to be specified when the table is created from a query. In our use cases, we join and filter some big tables to create new ones. The original bq_perform_query did not pass these two parameters on.
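For reference, one way to get a partitioned and clustered table out of a query without any change to bq_perform_query is to put the partitioning into the SQL itself, using BigQuery's CREATE TABLE ... PARTITION BY ... CLUSTER BY ... AS SELECT statement. The sketch below is only an illustration: build_ctas() is a hypothetical helper, not part of bigrquery, and the dataset, table, column, and project names are made up.

```r
# Hypothetical helper (NOT part of bigrquery): builds a
# CREATE TABLE ... AS SELECT statement with partitioning and clustering.
build_ctas <- function(table, select_sql, partition_expr,
                       cluster_fields = character()) {
  stopifnot(
    is.character(table), length(table) == 1,
    is.character(select_sql), length(select_sql) == 1,
    is.character(partition_expr), length(partition_expr) == 1,
    is.character(cluster_fields)
  )

  # The CLUSTER BY clause is optional; emit it only when fields are given.
  cluster_clause <- if (length(cluster_fields) > 0) {
    paste0("CLUSTER BY ", paste(cluster_fields, collapse = ", "), "\n")
  } else {
    ""
  }

  paste0(
    "CREATE TABLE `", table, "`\n",
    "PARTITION BY ", partition_expr, "\n",
    cluster_clause,
    "AS\n", select_sql
  )
}

# All names below are placeholders chosen for illustration.
sql <- build_ctas(
  table = "my_dataset.events_filtered",
  select_sql = "SELECT * FROM `my_dataset.events` WHERE field1 IS NOT NULL",
  partition_expr = "DATE(event_ts)",
  cluster_fields = "field1"
)
cat(sql, "\n")

# Running the statement needs credentials and a billing project,
# so the call is left commented out:
# bigrquery::bq_project_query("my-billing-project", sql)
```

Because the statement is DDL, the destination is named in the SQL rather than through a destination_table argument. This does not replace the feature requested here; it only shows that the same result is reachable through SQL.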

SeagleLiu avatar Nov 23 '20 22:11 SeagleLiu

If you're still interested in this, would you mind filing an issue? I think the current implementation is a bit too minimal, and I'm worried that it'll make it too easy for users to shoot themselves in the foot.

hadley avatar Nov 07 '23 23:11 hadley