🐛 [firestore-bigquery-export] Table not partitioned despite using default _PARTITIONTIME pseudo column

Open ytetsuro opened this issue 1 year ago • 4 comments

[READ] Step 1: Are you in the right place?

Issues filed here should be about bugs for a specific extension in this repository. If you have a general question, need help debugging, or fall into some other category, use one of these other channels:

  • For general technical questions, post a question on StackOverflow with the firebase tag.
  • For general Firebase discussion, use the firebase-talk google group.
  • To file a bug against the Firebase Extensions platform, or for an issue affecting multiple extensions, please reach out to Firebase support directly.

[REQUIRED] Step 2: Describe your configuration

  • Extension name: firestore-bigquery-export
  • Extension version: 0.1.55
  • Configuration values (redact info where appropriate):
    • BigQuery SQL table Time Partitioning option type: HOUR
    • BigQuery Time Partitioning column name: NONE
    • Firestore Document field name for BigQuery SQL Time Partitioning field option: NONE
    • BigQuery SQL Time Partitioning table schema field(column) type: omit

[REQUIRED] Step 3: Describe the problem

Steps to reproduce:

Install the Firestore extension with the above settings. The following code may be the problem:

https://github.com/firebase/extensions/blob/next/firestore-bigquery-export/firestore-bigquery-change-tracker/src/bigquery/partitioning.ts#L97

Expected result

An unpartitioned table is generated.

Actual result

A table with _PARTITIONTIME pseudo column is generated.

This is because the BigQuery Time Partitioning column name parameter has the following description:

BigQuery table column/schema field name for TimePartitioning. You can choose schema available as timestamp OR a new custom defined column that will be assigned to the selected Firestore Document field below. Defaults to pseudo column _PARTITIONTIME if unspecified. Cannot be changed if Table is already partitioned.

ytetsuro • Oct 17 '24 11:10

I'm also seeing issues with partitioning not working.

If I install a new instance of the extension with the configuration:

  • BigQuery SQL table Time Partitioning option type = DAY
  • BigQuery Time Partitioning column name (Optional) = [EMPTY]
  • Firestore Document field name for BigQuery SQL Time Partitioning field option (Optional) = [EMPTY]
  • BigQuery SQL Time Partitioning table schema field(column) type (Optional) = [EMPTY]

I get a table generated that is partitioned by _PARTITIONTIME, but the value of _PARTITIONTIME is always NULL.

If I update the instance to create a new dataset with the configuration:

  • BigQuery SQL table Time Partitioning option type = DAY
  • BigQuery Time Partitioning column name (Optional) = partition_column
  • Firestore Document field name for BigQuery SQL Time Partitioning field option (Optional) = [EMPTY]
  • BigQuery SQL Time Partitioning table schema field(column) type (Optional) = [EMPTY]

A new table is created, but it isn't partitioned. The column partition_column is not created.
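One way to tell these outcomes apart is to inspect the table's metadata rather than query results. The sketch below is a minimal, hypothetical helper: `describePartitioning`, `TableMetadataLike`, and `PartitioningSummary` are names made up for this example and are not part of the extension. It interprets the `timePartitioning` block of a BigQuery table resource, assuming the metadata has already been fetched (for example with the Node.js client's `table.getMetadata()`).

```typescript
// Subset of the BigQuery table resource that matters for time partitioning.
// `TableMetadataLike` is a hypothetical name used only in this sketch.
interface TableMetadataLike {
  timePartitioning?: {
    type?: string; // e.g. "HOUR" or "DAY"
    field?: string; // not set for ingestion-time partitioning
  };
}

type PartitioningSummary =
  | { partitioned: false }
  | { partitioned: true; granularity: string; column: string };

// Summarise how a table is partitioned. When `timePartitioning` is present
// but has no `field`, the table is partitioned by ingestion time and exposes
// the _PARTITIONTIME pseudo column.
function describePartitioning(metadata: TableMetadataLike): PartitioningSummary {
  const tp = metadata.timePartitioning;
  if (!tp) {
    return { partitioned: false };
  }
  return {
    partitioned: true,
    granularity: tp.type ?? "UNKNOWN",
    column: tp.field ?? "_PARTITIONTIME",
  };
}
```

For the second configuration above, a result of `{ partitioned: false }` would match what I'm seeing, whereas a correctly partitioned table would report `partition_column` as its column.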

The documentation on configuring clustering through the extension is pretty confusing and would probably benefit from another revision.

dan-massey • Oct 30 '24 22:10

Hi @ytetsuro, @dan-massey

Thanks for reporting this issue! We’ve received it and are reviewing it. We’ll provide updates as soon as possible.

cabljac • Feb 11 '25 11:02

A few things from my testing so far:

  1. Works when I provide the following configuration:
    • BigQuery SQL table Time Partitioning option type: HOUR
    • BigQuery Time Partitioning column name: NONE
    • Firestore Document field name for BigQuery SQL Time Partitioning field option: NONE
    • BigQuery SQL Time Partitioning table schema field(column) type: omit
  2. I get NULL for _PARTITIONTIME when providing the following configuration:
    • BigQuery SQL table Time Partitioning option type: DAY
    • BigQuery Time Partitioning column name: NONE
    • Firestore Document field name for BigQuery SQL Time Partitioning field option: NONE
    • BigQuery SQL Time Partitioning table schema field(column) type: omit
  3. The table is not partitioned with the following configuration:
    • BigQuery SQL table Time Partitioning option type: HOUR
    • BigQuery Time Partitioning column name: partition_column
    • Firestore Document field name for BigQuery SQL Time Partitioning field option: NONE
    • BigQuery SQL Time Partitioning table schema field(column) type: omit
  4. The table is not partitioned with the following configuration:
    • BigQuery SQL table Time Partitioning option type: HOUR
    • BigQuery Time Partitioning column name: NONE
    • Firestore Document field name for BigQuery SQL Time Partitioning field option: time
    • BigQuery SQL Time Partitioning table schema field(column) type: omit
  5. The table is not partitioned with the following configuration:
    • BigQuery SQL table Time Partitioning option type: HOUR
    • BigQuery Time Partitioning column name: NONE
    • Firestore Document field name for BigQuery SQL Time Partitioning field option: time
    • BigQuery SQL Time Partitioning table schema field(column) type: TIMESTAMP
  6. The table is not partitioned with the following configuration:
    • BigQuery SQL table Time Partitioning option type: NONE
    • BigQuery Time Partitioning column name: NONE
    • Firestore Document field name for BigQuery SQL Time Partitioning field option: time
    • BigQuery SQL Time Partitioning table schema field(column) type: TIMESTAMP
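
For anyone who wants to re-run these against a fix, the six configurations above can be written down as data. This is only a sketch of a test matrix: the type and field names (`PartitioningCase`, `observedCases`, `casesWithOutcome`, and so on) are hypothetical and do not correspond to the extension's actual parameter keys, and `null` stands in for a value left as NONE or omitted.

```typescript
// Outcomes as reported in the list above.
type Outcome = "works" | "_PARTITIONTIME is NULL" | "not partitioned";

// Hypothetical shape for one tested configuration; not the extension's API.
interface PartitioningCase {
  partitioningType: "HOUR" | "DAY" | "NONE";
  columnName: string | null; // null = left as NONE
  firestoreField: string | null; // null = left as NONE
  fieldType: "TIMESTAMP" | null; // null = omitted
  observed: Outcome;
}

// One entry per numbered item, in the same order.
const observedCases: PartitioningCase[] = [
  { partitioningType: "HOUR", columnName: null, firestoreField: null, fieldType: null, observed: "works" },
  { partitioningType: "DAY", columnName: null, firestoreField: null, fieldType: null, observed: "_PARTITIONTIME is NULL" },
  { partitioningType: "HOUR", columnName: "partition_column", firestoreField: null, fieldType: null, observed: "not partitioned" },
  { partitioningType: "HOUR", columnName: null, firestoreField: "time", fieldType: null, observed: "not partitioned" },
  { partitioningType: "HOUR", columnName: null, firestoreField: "time", fieldType: "TIMESTAMP", observed: "not partitioned" },
  { partitioningType: "NONE", columnName: null, firestoreField: "time", fieldType: "TIMESTAMP", observed: "not partitioned" },
];

// Return every configuration that produced the given outcome.
function casesWithOutcome(outcome: Outcome): PartitioningCase[] {
  return observedCases.filter((c) => c.observed === outcome);
}
```

For example, `casesWithOutcome("not partitioned")` returns the four configurations from items 3 to 6.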

CorieW • May 22 '25 13:05

Hi all, just an update: we have an in-progress PR here to refactor and fix the partitioning issues. It's just a matter of getting it prioritised and merged, and hopefully it will resolve the issues you're having.

cabljac • Sep 11 '25 07:09