cloudflare-gcp
Add support for time partitioning and expiry
The current behaviour is to load everything into a non-partitioned table, which means that every query scans the entire table.
In order to make queries cheaper, we can use time partitioning.
One nice side-effect of this is that we also get the ability to configure expiration.
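To make the idea concrete, here is a minimal sketch of the table metadata involved, assuming daily ingestion-time partitioning. The helper name `partitionedTableOptions` and the `expirationDays` parameter are hypothetical and not part of this patch; the `timePartitioning.type` and `timePartitioning.expirationMs` fields are the ones defined on the BigQuery table resource.

```javascript
// Hypothetical helper (not from this patch): builds table metadata that asks
// BigQuery for daily time partitioning with an optional partition expiry.
function partitionedTableOptions(schema, expirationDays) {
  const options = {
    schema,
    // With no `field` set, partitioning is by ingestion time.
    timePartitioning: { type: 'DAY' },
  };
  if (expirationDays != null) {
    const msPerDay = 24 * 60 * 60 * 1000;
    // The BigQuery API represents int64 values as strings.
    options.timePartitioning.expirationMs = String(expirationDays * msPerDay);
  }
  return options;
}
```

With the `@google-cloud/bigquery` client, an object like this would typically be passed as the second argument to `dataset.createTable(tableId, options)`; that wiring is an assumption here and is untested against this repo.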
Note: I could use some help testing out this patch.
Hey @igorwwwwwwwwwwwwwwwwwwww, thanks for taking a stab at this. Creating ingestion-time partitioning for logs inserted via a load job has historically been non-trivial. The approach in your PR did not work when it was tried before (see https://github.com/cloudflare/cloudflare-gcp/blob/add-partition-v2/logpush-to-bigquery/index.js), so we will need a test before a merge can be considered.
FWIW: to my knowledge, this can only be accomplished in BigQuery using "partition decorators", which are described here: https://cloud.google.com/bigquery/docs/creating-partitioned-tables#creating_an_ingestion-time_partitioned_table_when_loading_data
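For reference, a partition decorator is the table ID followed by `$` and the target day as `YYYYMMDD`. A rough sketch of building one is below; the helper name `partitionDecorator` is hypothetical, and it assumes daily partitions with UTC day boundaries.

```javascript
// Hypothetical helper: returns a table ID with a daily partition decorator,
// e.g. "logs$20190105". Uses the UTC date of the given Date object.
function partitionDecorator(tableId, date) {
  // toISOString() always reports UTC, e.g. "2019-01-05T00:00:00.000Z".
  const yyyymmdd = date.toISOString().slice(0, 10).replace(/-/g, '');
  return `${tableId}$${yyyymmdd}`;
}
```

The load job would then target the decorated ID instead of the bare table name. Whether the client code in this repo passes a decorated ID through unchanged is exactly the kind of thing the requested test would need to confirm.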