🐛 [firestore-bigquery-export] Table not partitioned despite using default _PARTITIONTIME pseudo column
[REQUIRED] Step 2: Describe your configuration
- Extension name: firestore-bigquery-export
- Extension version: 0.1.55
- Configuration values (redact info where appropriate):
  - BigQuery SQL table Time Partitioning option type: HOUR
  - BigQuery Time Partitioning column name: NONE
  - Firestore Document field name for BigQuery SQL Time Partitioning field option: NONE
  - BigQuery SQL Time Partitioning table schema field(column) type: omit
[REQUIRED] Step 3: Describe the problem
Steps to reproduce:
Install the firestore-bigquery-export extension with the settings above. The following code may be the cause:
https://github.com/firebase/extensions/blob/next/firestore-bigquery-export/firestore-bigquery-change-tracker/src/bigquery/partitioning.ts#L97
Expected result
An unpartitioned table is generated.
Actual result
A table with _PARTITIONTIME pseudo column is generated.
The BigQuery Time Partitioning column name parameter has the following description:
BigQuery table column/schema field name for TimePartitioning. You can choose schema available as timestamp OR a new custom defined column that will be assigned to the selected Firestore Document field below. Defaults to pseudo column _PARTITIONTIME if unspecified. Cannot be changed if Table is already partitioned.
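To make the documented default concrete, here is a minimal sketch of how the parameters would map to a BigQuery `timePartitioning` setting if the description above is taken literally. All type and function names are hypothetical and are not taken from the extension's source. The treatment of `NONE`, `omit`, and empty strings as "unset" is an assumption based on how values are written in this thread. The one BigQuery fact relied on is that a table with a `timePartitioning.type` but no `field` is partitioned by ingestion time and exposes the `_PARTITIONTIME` pseudo column.

```typescript
// Hypothetical names throughout: this illustrates the documented behavior
// and is not the extension's implementation.
type Granularity = "HOUR" | "DAY" | "MONTH" | "YEAR";

interface PartitioningParams {
  partitioningType: Granularity | "NONE"; // "Time Partitioning option type"
  columnName?: string;                    // "Time Partitioning column name"
}

// Mirrors the `timePartitioning` property of a BigQuery table resource.
interface TimePartitioning {
  type: Granularity;
  field?: string; // omitted => ingestion-time partitioning (_PARTITIONTIME)
}

function expectedTimePartitioning(
  params: PartitioningParams
): TimePartitioning | undefined {
  // No partitioning requested: the table should be unpartitioned.
  if (params.partitioningType === "NONE") {
    return undefined;
  }

  // Assumption: NONE, omit, and empty values all mean "unspecified".
  const column = params.columnName?.trim();
  if (
    column === undefined ||
    column === "" ||
    column === "NONE" ||
    column === "omit"
  ) {
    // Documented default: fall back to the _PARTITIONTIME pseudo column.
    return { type: params.partitioningType };
  }

  // A named column was given: partition on that column.
  return { type: params.partitioningType, field: column };
}

// The configuration from Step 2 of this report.
console.log(
  expectedTimePartitioning({ partitioningType: "HOUR", columnName: "NONE" })
);
```

Under this reading, the Step 2 configuration should yield an hourly, ingestion-time partitioned table, and only a partitioning type of `NONE` should yield an unpartitioned one.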
I'm also seeing issues with partitioning not working.
If I install a new instance of the extension with the configuration:
- BigQuery SQL table Time Partitioning option type = DAY
- BigQuery Time Partitioning column name (Optional) = [EMPTY]
- Firestore Document field name for BigQuery SQL Time Partitioning field option (Optional) = [EMPTY]
- BigQuery SQL Time Partitioning table schema field(column) type (Optional) = [EMPTY]
I get a table generated that is partitioned by _PARTITIONTIME, but the value of _PARTITIONTIME is always NULL.
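One way to narrow this down is to count how many rows have a `_PARTITIONTIME` assigned. BigQuery's documentation notes that rows still in the streaming buffer of an ingestion-time partitioned table report `NULL` for `_PARTITIONTIME` until they are flushed to a partition. Whether that explains the behavior here is an assumption and has not been confirmed in this thread. The sketch below only builds the query text; the helper name and the project, dataset, and table IDs are hypothetical placeholders.

```typescript
// Hypothetical helper: builds a BigQuery Standard SQL query that splits rows
// by whether _PARTITIONTIME has been assigned yet.
function partitionTimeNullCheck(
  project: string,
  dataset: string,
  table: string
): string {
  return [
    "SELECT",
    "  COUNTIF(_PARTITIONTIME IS NULL) AS rows_without_partition_time,",
    "  COUNTIF(_PARTITIONTIME IS NOT NULL) AS rows_with_partition_time",
    `FROM \`${project}.${dataset}.${table}\``,
  ].join("\n");
}

// Placeholder identifiers; substitute the table the extension created.
console.log(
  partitionTimeNullCheck("my-project", "my_dataset", "my_table_raw_changelog")
);
```

If `rows_with_partition_time` grows when the query is rerun some time later, the `NULL` values were probably buffered rows. If it stays at zero, the problem more likely lies in how the table was created or written to.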
If I update the instance to create a new dataset with the configuration:
- BigQuery SQL table Time Partitioning option type = DAY
- BigQuery Time Partitioning column name (Optional) = partition_column
- Firestore Document field name for BigQuery SQL Time Partitioning field option (Optional) = [EMPTY]
- BigQuery SQL Time Partitioning table schema field(column) type (Optional) = [EMPTY]
A new table is created, but it isn't partitioned. The column partition_column is not created.
The documentation on configuring clustering through the extension is pretty confusing and would probably benefit from another revision.
Hi @ytetsuro , @dan-massey
Thanks for reporting this issue! We’ve received it and are reviewing it. We’ll provide updates as soon as possible.
A few things from my testing so far:
- Works when I provide the following configuration:
  - BigQuery SQL table Time Partitioning option type: HOUR
  - BigQuery Time Partitioning column name: NONE
  - Firestore Document field name for BigQuery SQL Time Partitioning field option: NONE
  - BigQuery SQL Time Partitioning table schema field(column) type: omit
- I get NULL for _PARTITIONTIME when providing the following configuration:
  - BigQuery SQL table Time Partitioning option type: DAY
  - BigQuery Time Partitioning column name: NONE
  - Firestore Document field name for BigQuery SQL Time Partitioning field option: NONE
  - BigQuery SQL Time Partitioning table schema field(column) type: omit
- The table is not partitioned with the following configuration:
  - BigQuery SQL table Time Partitioning option type: HOUR
  - BigQuery Time Partitioning column name: partition_column
  - Firestore Document field name for BigQuery SQL Time Partitioning field option: NONE
  - BigQuery SQL Time Partitioning table schema field(column) type: omit
- The table is not partitioned with the following configuration:
  - BigQuery SQL table Time Partitioning option type: HOUR
  - BigQuery Time Partitioning column name: NONE
  - Firestore Document field name for BigQuery SQL Time Partitioning field option: time
  - BigQuery SQL Time Partitioning table schema field(column) type: omit
- The table is not partitioned with the following configuration:
  - BigQuery SQL table Time Partitioning option type: HOUR
  - BigQuery Time Partitioning column name: NONE
  - Firestore Document field name for BigQuery SQL Time Partitioning field option: time
  - BigQuery SQL Time Partitioning table schema field(column) type: TIMESTAMP
- The table is not partitioned with the following configuration:
  - BigQuery SQL table Time Partitioning option type: NONE
  - BigQuery Time Partitioning column name: NONE
  - Firestore Document field name for BigQuery SQL Time Partitioning field option: time
  - BigQuery SQL Time Partitioning table schema field(column) type: TIMESTAMP
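For anyone who wants to turn the results above into a regression check, the six configurations can be recorded as data. This is only a sketch: the type and field names are hypothetical, and the `reported` values are a transcription of the outcomes listed above, not new measurements. The filter flags cases where a partitioning type other than `NONE` was requested but the table came out unpartitioned.

```typescript
// Hypothetical shape for one tested configuration and its reported outcome.
type Reported = "works" | "partition-time-null" | "not-partitioned";

interface TestedConfig {
  partitioningType: string; // Time Partitioning option type
  columnName: string;       // Time Partitioning column name
  firestoreField: string;   // Firestore Document field name
  fieldType: string;        // table schema field(column) type
  reported: Reported;       // outcome as reported in this thread
}

// Transcribed from the list above, in the same order.
const testedConfigs: TestedConfig[] = [
  { partitioningType: "HOUR", columnName: "NONE", firestoreField: "NONE", fieldType: "omit", reported: "works" },
  { partitioningType: "DAY", columnName: "NONE", firestoreField: "NONE", fieldType: "omit", reported: "partition-time-null" },
  { partitioningType: "HOUR", columnName: "partition_column", firestoreField: "NONE", fieldType: "omit", reported: "not-partitioned" },
  { partitioningType: "HOUR", columnName: "NONE", firestoreField: "time", fieldType: "omit", reported: "not-partitioned" },
  { partitioningType: "HOUR", columnName: "NONE", firestoreField: "time", fieldType: "TIMESTAMP", reported: "not-partitioned" },
  { partitioningType: "NONE", columnName: "NONE", firestoreField: "time", fieldType: "TIMESTAMP", reported: "not-partitioned" },
];

// Cases where partitioning was requested but the table was not partitioned.
const requestedButUnpartitioned = testedConfigs.filter(
  (c) => c.partitioningType !== "NONE" && c.reported === "not-partitioned"
);

for (const c of requestedButUnpartitioned) {
  console.log(
    `${c.partitioningType} / column=${c.columnName} / field=${c.firestoreField} / type=${c.fieldType}`
  );
}
```

The last configuration is excluded by the filter because its partitioning type is `NONE`, so an unpartitioned table is consistent with what was requested there.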
Hi all, just as an update: we have an in-progress PR here to refactor and fix the partitioning issues. It's just a matter of getting it prioritised and merged in, and hopefully it will resolve the issues you're having.