[firestore-bigquery-export] Create table with latest data
Hi,
This feature request is for extension: firestore-bigquery-export
What feature would you like to see?
I would like to be able to create a table containing only the latest data, making use of the data schema. I have seen the script (https://github.com/firebase/extensions/blob/9f8d7fd6048bcaa7b5bc505cebe2e90494359e33/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md) that generates the latest view, but a view does not fit my needs for the following reasons:
- Querying a view reads all the underlying data (even if I can optimize with partitioning), which increases BigQuery usage costs.
- With a view, I cannot take advantage of materialized views (it is not possible to create a materialized view on top of a view) to aggregate data at a higher level and optimize queries for performance and cost.
Is this already planned? What do you think of it?
If it is not planned, or cannot be included in the short term, how would you advise doing it?
- Create a BigQuery scheduled query that runs regularly to insert the newest data into another table (deduplicating so that only the latest version of each document is kept)?
- Copy the code of your Cloud Functions and adapt it to fit this need?
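To make the first option concrete, here is a rough sketch of what such a scheduled query could look like, written as a small Python helper that builds a BigQuery `MERGE` statement. It is not part of the extension. The helper name `build_latest_merge_sql`, the table names, and the lookback window are hypothetical, and the column names (`document_name`, `timestamp`, `operation`, `data`) and the `'DELETE'` operation value are assumptions about the changelog table that should be checked against the actual schema.

```python
def build_latest_merge_sql(changelog_table: str, latest_table: str,
                           lookback_hours: int = 2) -> str:
    """Build a MERGE statement that keeps one row per document in latest_table.

    Assumes (not verified here) that the changelog has the columns
    document_name, timestamp, operation and data, and that deletions
    are recorded with operation = 'DELETE'.
    """
    if lookback_hours <= 0:
        raise ValueError("lookback_hours must be positive")
    return f"""
MERGE `{latest_table}` AS t
USING (
  -- Newest change per document within the lookback window.
  SELECT document_name, timestamp, operation, data
  FROM (
    SELECT document_name, timestamp, operation, data,
           ROW_NUMBER() OVER (
             PARTITION BY document_name ORDER BY timestamp DESC
           ) AS rn
    FROM `{changelog_table}`
    WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {lookback_hours} HOUR)
  )
  WHERE rn = 1
) AS s
ON t.document_name = s.document_name
-- Drop documents whose newest change is a deletion.
WHEN MATCHED AND s.timestamp > t.timestamp AND s.operation = 'DELETE' THEN
  DELETE
-- Overwrite only when the incoming change is newer than the stored one.
WHEN MATCHED AND s.timestamp > t.timestamp THEN
  UPDATE SET timestamp = s.timestamp, operation = s.operation, data = s.data
-- First time this document is seen, unless it is already deleted.
WHEN NOT MATCHED AND s.operation != 'DELETE' THEN
  INSERT (document_name, timestamp, operation, data)
  VALUES (s.document_name, s.timestamp, s.operation, s.data)
""".strip()


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    print(build_latest_merge_sql("my-project.my_dataset.items_raw_changelog",
                                 "my-project.my_dataset.items_latest"))
```

The generated statement could be pasted into a scheduled query. The lookback window would need to be longer than the schedule interval so that no changes are skipped between runs, and the `timestamp` filter should only reduce the bytes scanned if the changelog table is partitioned on that column.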
Thanks for your help, Best regards
I would like to see this as well. We have a collection that has a lot of data turnover, and so the changelog table is HUGE. Doing any sort of operation on the "schema latest" view is slow and very costly.
It might be sufficient to add periodic data pruning (purge all BQ rows for documents that have been deleted).
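To spell out what I mean by pruning, here is a minimal in-memory sketch of the rule in Python. It only illustrates the logic and does not talk to BigQuery: a document counts as deleted when its most recent change is a delete, so a document that was deleted and later re-created is kept. The function name `prune_deleted` and the row fields (`document_name`, `timestamp`, `operation`, with `'DELETE'` marking deletions) are assumptions about the changelog layout.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def prune_deleted(rows: List[Row]) -> List[Row]:
    """Remove every changelog row of documents whose newest change is a DELETE."""
    # Find the newest change for each document.
    newest: Dict[str, Row] = {}
    for row in rows:
        name = row["document_name"]
        if name not in newest or row["timestamp"] > newest[name]["timestamp"]:
            newest[name] = row

    # Documents that are currently deleted.
    deleted = {name for name, row in newest.items()
               if row["operation"] == "DELETE"}

    # Keep the full history of every document that still exists.
    return [row for row in rows if row["document_name"] not in deleted]


if __name__ == "__main__":
    changelog = [
        {"document_name": "a", "timestamp": 1, "operation": "CREATE"},
        {"document_name": "a", "timestamp": 2, "operation": "DELETE"},
        {"document_name": "b", "timestamp": 1, "operation": "CREATE"},
        {"document_name": "b", "timestamp": 2, "operation": "DELETE"},
        {"document_name": "b", "timestamp": 3, "operation": "CREATE"},
    ]
    # Only the rows of document "b" remain, because "b" was re-created.
    print(prune_deleted(changelog))
```

In BigQuery the same rule would presumably be expressed as a scheduled `DELETE` on the changelog table, keyed on the set of documents whose newest row is a delete.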
Possibly related to #1608.
We will raise this with the team as a potential new feature.