ethereum-etl-airflow

Add clustering to BigQuery tables where appropriate

Open medvedev1088 opened this issue 4 years ago • 2 comments

See https://cloud.google.com/bigquery/docs/clustered-tables. Clustering can make queries that filter or aggregate on the clustered columns cheaper and faster.

medvedev1088, Nov 01 '19 14:11

I was thinking of this when reading @askeluv's blog post: https://towardsdatascience.com/how-to-get-any-ethereum-smart-contract-into-bigquery-in-8-mins-bab5db1fdeee

If we implement clustering/partitioning, it should be possible to reduce the cost of querying the logs table directly, i.e. there would be no need to create contract-specific tables.
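To make the idea concrete, here is a minimal sketch in Python that only builds the SQL strings and does not call BigQuery. It assumes the logs table has `block_timestamp` and `address` columns, as in the public `bigquery-public-data.crypto_ethereum.logs` table. The target table name `my_project.my_dataset.logs_clustered` and both helper functions are hypothetical, not part of ethereum-etl-airflow.

```python
# Sketch only: builds SQL text; nothing here talks to BigQuery.
# Table names and column names below are assumptions for illustration.

MAX_CLUSTER_COLUMNS = 4  # BigQuery accepts at most four clustering columns


def clustered_copy_ddl(source, target, partition_column, cluster_columns):
    """Return DDL that copies `source` into a partitioned, clustered `target`."""
    if not 1 <= len(cluster_columns) <= MAX_CLUSTER_COLUMNS:
        raise ValueError("expected between 1 and 4 clustering columns")
    return (
        f"CREATE TABLE `{target}`\n"
        f"PARTITION BY DATE({partition_column})\n"
        f"CLUSTER BY {', '.join(cluster_columns)}\n"
        f"AS SELECT * FROM `{source}`"
    )


def contract_logs_query(table):
    """Return a query that filters on the partition and clustering columns.

    @start_date and @contract_address are BigQuery named query parameters,
    to be supplied by whatever client runs the query.
    """
    return (
        f"SELECT * FROM `{table}`\n"
        "WHERE DATE(block_timestamp) >= @start_date\n"
        "  AND address = @contract_address"
    )


if __name__ == "__main__":
    target = "my_project.my_dataset.logs_clustered"  # hypothetical name
    print(clustered_copy_ddl(
        source="bigquery-public-data.crypto_ethereum.logs",
        target=target,
        partition_column="block_timestamp",
        cluster_columns=["address"],
    ))
    print(contract_logs_query(target))
```

With a layout like this, a query for a single contract filters on the clustering column, so BigQuery can skip blocks belonging to other addresses rather than scanning the whole table.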

However, streaming into partitions is still being actively worked on; see: https://issuetracker.google.com/issues/35905817#comment89

allenday, May 18 '20 03:05

That's a good idea.

medvedev1088, May 18 '20 05:05