cloudflare-gcp

Add support for time partitioning and expiry

Open · igorwwwwwwwwwwwwwwwwwwww opened this issue 4 years ago · 1 comment

The current behaviour is to load everything into a non-partitioned table. That means every query scans the entire table, no matter how narrow a time range it asks for.

To make queries cheaper, we can use time partitioning: a query that filters on the partition time then only scans the matching partitions.

One nice side-effect of this is that we also get the ability to configure expiration.
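
To make the idea concrete, here is a rough sketch of what the load job metadata could look like. It is not the code from the patch. The field names (`timePartitioning`, `type`, `expirationMs`) come from the BigQuery load job configuration; the function name `buildLoadMetadata`, the source format, and the write disposition are assumptions made for illustration. As far as I know, `timePartitioning` on a load job only applies when the job creates the destination table, which is part of what needs testing.

```javascript
// Sketch only: builds the metadata object for a BigQuery load job.
// buildLoadMetadata is a hypothetical helper, not part of the repo.
function buildLoadMetadata(expirationDays) {
  const metadata = {
    // Assumption: the logs are loaded as newline-delimited JSON and appended.
    sourceFormat: 'NEWLINE_DELIMITED_JSON',
    writeDisposition: 'WRITE_APPEND',
    // Ingestion-time partitioning: one partition per day of load time.
    timePartitioning: { type: 'DAY' },
  };
  if (expirationDays != null) {
    // expirationMs is an int64 in the API, so it is sent as a string.
    const msPerDay = 24 * 60 * 60 * 1000;
    metadata.timePartitioning.expirationMs = String(expirationDays * msPerDay);
  }
  return metadata;
}
```

With the Node.js client this object would presumably be passed as the second argument to `table.load(file, metadata)`; leaving `expirationDays` unset keeps partitions indefinitely.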

Note: I could use some help testing out this patch.

Hey @igorwwwwwwwwwwwwwwwwwwww thanks for taking a stab at this. In the past, creating ingestion-time partitioning for logs inserted via a load job has been non-trivial. The solution in your PR has not worked in the past (see https://github.com/cloudflare/cloudflare-gcp/blob/add-partition-v2/logpush-to-bigquery/index.js), so we will need a test before a merge can be considered.

FWIW: to my knowledge, this can only be accomplished in BigQuery using "partition decorators" which are described here: https://cloud.google.com/bigquery/docs/creating-partitioned-tables#creating_an_ingestion-time_partitioned_table_when_loading_data
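
For reference, a partition decorator is the table name followed by `$` and the partition date as `YYYYMMDD`. A minimal sketch of building one is below. The helper name `partitionDecorator` and the table name in the usage note are hypothetical, and the sketch assumes daily partitions keyed on the UTC date.

```javascript
// Sketch only: builds a destination table ID that targets a single
// daily partition, e.g. "mytable$20200612".
// partitionDecorator is a hypothetical helper, not part of the repo.
function partitionDecorator(tableId, date) {
  // toISOString() is always UTC, so this takes the UTC calendar date.
  const yyyymmdd = date.toISOString().slice(0, 10).replace(/-/g, '');
  return tableId + '$' + yyyymmdd;
}
```

For example, `partitionDecorator('http_requests', new Date(Date.UTC(2020, 5, 12)))` gives `http_requests$20200612`, which would then be used as the destination table of the load job.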

shagamemnon, Jun 12 '20 19:06