bug(firestore-bigquery-export): Partitioning column remains null for valid Firestore Timestamp values
[REQUIRED] Step 2: Describe your configuration
Extension name: firestore-bigquery-export
Extension version: firebase/[email protected]
Configuration values (redact info where appropriate):
[REQUIRED] Step 3: Describe the problem
Steps to reproduce:
- Configure the extension as shown in the image above
- Create a document in the collection with a Timestamp field, e.g.:
Expected result
Matching BigQuery record has a created_at column populated with the corresponding Firestore value.
Actual result
Matching BigQuery record has null in the created_at column, and the logs read:
ext-orders-partitoned-fsexportbigqueryjm85infkwxc7
Wrong type of Firestore Field for TimePartitioning. Accepts only strings in BigQuery format (DATE, DATETIME, TIMESTAMP) and Firestore Timestamp. Firestore Document field path: projects/<redacted>/databases/(default)/documents/orders/<redacted>. Field name: dateCreated. Field data: [object Object]. Schema field "created_at" value will be null.
Wrong type of Firestore Field for TimePartitioning. Accepts only strings in BigQuery format (DATE, DATETIME, TIMESTAMP) and Firestore Timestamp. Firestore Document field path: projects/<redacted>/databases/(default)/documents/<redacted>/<redacted>. Field name: dateCreated. Field data: [object Object]. Schema field "created_at" value will be null.
Looks like it has recurred after being fixed in PR #906
Thanks @githinjikamau
This error seems to suggest the following validation check is failing...
    private isValidPartitionTypeDate(value) {
      /* Check if valid timestamp value from sdk */
      if (value instanceof firebase.firestore.Timestamp) return true;
      /* Check if valid date/time value from console */
      return Object.prototype.toString.call(value) === "[object Date]";
    }
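One possible reason an `instanceof` check like this can reject a genuine Timestamp is that the value was created by a different copy of the class than the one the check references. That is an assumption here, not a confirmed diagnosis of this bug. The sketch below reproduces the effect without the Firebase SDK: `TimestampA` and `TimestampB` are hypothetical stand-ins for two separately loaded copies of a Timestamp class, and `isTimestampLike` is a hypothetical duck-typed alternative, not something the extension is known to use.

```typescript
// Hypothetical stand-ins for two separately loaded copies of a Timestamp
// class. They are structurally identical but are distinct constructors.
class TimestampA {
  seconds: number;
  nanoseconds: number;
  constructor(seconds: number, nanoseconds: number) {
    this.seconds = seconds;
    this.nanoseconds = nanoseconds;
  }
  toDate(): Date {
    return new Date(this.seconds * 1000 + Math.floor(this.nanoseconds / 1e6));
  }
}

class TimestampB {
  seconds: number;
  nanoseconds: number;
  constructor(seconds: number, nanoseconds: number) {
    this.seconds = seconds;
    this.nanoseconds = nanoseconds;
  }
  toDate(): Date {
    return new Date(this.seconds * 1000 + Math.floor(this.nanoseconds / 1e6));
  }
}

// Same shape as the quoted check, pinned to one copy of the class.
function isValidByInstanceof(value: unknown): boolean {
  if (value instanceof TimestampA) return true;
  return Object.prototype.toString.call(value) === "[object Date]";
}

// Hypothetical alternative: accept anything that looks like a Timestamp,
// regardless of which constructor produced it.
function isTimestampLike(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { seconds?: unknown; nanoseconds?: unknown; toDate?: unknown };
  return (
    typeof v.seconds === "number" &&
    typeof v.nanoseconds === "number" &&
    typeof v.toDate === "function"
  );
}

const fromOtherCopy = new TimestampB(1650000000, 0);

console.log(isValidByInstanceof(fromOtherCopy)); // false
console.log(isTimestampLike(fromOtherCopy));     // true
console.log(String(fromOtherCopy));              // [object Object]
```

The last line matches the `Field data: [object Object]` text in the reported log, which is what any plain class instance prints when coerced to a string, so that part of the log alone does not show what type the value actually was.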
I'll add to our project board for investigation.
Would it be possible to explain how data is currently added to the database?
For example, is it added manually through the Firebase console or through the Firebase SDK?
Hey @dackers86,
Thank you for your quick response. The data is added via the Firebase SDK.
Further to this, and referring back to the issue I raised in May last year (now closed), I can also confirm that creating a partitioned table still doesn't appear to work. I have also attempted to configure the extension in a similar manner to @githinjikamau.
My conclusion is that, other than the required configuration options (dataset ID, source Firestore collection and destination table prefix), none of the subsequent options do anything at all.
I've tried all of them in various combinations, and no matter what I set my options to, I always end up with a non-partitioned table that seemingly ignores all the settings I've provided.
Here's an extract from the extension's log when attempting to backfill the BigQuery table from Firestore using the script:
$ npx @firebaseextensions/fs-bq-import-collection
Importing data from Cloud Firestore Collection: sales, to BigQuery Dataset: firestore_stream_v2, Table: sales_by_month_raw_changelog
{"severity":"INFO","message":"Creating BigQuery dataset: firestore_stream_v2"}
{"severity":"INFO","message":"Created BigQuery dataset: firestore_stream_v2"}
{"severity":"INFO","message":"Creating BigQuery table: sales_by_month_raw_changelog"}
{"severity":"WARNING","message":"No valid table reference is available. Skipping partitioning"}
{"severity":"WARNING","message":"Cannot partition an existing table firestore_stream_v2_sales_by_month_raw_changelog"}
{"severity":"WARNING","message":"Cannot partition an existing table firestore_stream_v2_sales_by_month_raw_changelog"}
{"severity":"INFO","message":"Clustering removed on sales_by_month_raw_changelog"}
{"severity":"INFO","message":"Created BigQuery table: sales_by_month_raw_changelog"}
{"severity":"WARNING","message":"No valid table reference is available. Skipping partitioning"}
{"severity":"WARNING","message":"Cannot partition an existing table firestore_stream_v2_sales_by_month_raw_latest"}
Our live sales data continues to stream without issue, but the table is now so large that querying it for anything useful takes significant time. I'm not a BigQuery expert by any means, but if we could just get the extension to create a table partitioned by month using the 'created' date in our Firestore 'sales' document (which is at root level and a valid Firestore.Timestamp instance), this would be extremely useful!
If you require any more information that I haven't already provided in my (now closed) issue or this one, I'll gladly try to provide it.
One further thing that might be useful from the Firestore Timestamp side of things (not the partitioning problem): we use a sentinel value for the 'created' date/time, i.e., the server sets it at the time the document is written. Maybe there's something subtly different about this compared with @githinjikamau's use of the SDK?
Regards
Benj
Thanks @soarb
I'll retest using a server timestamp and post an example; I'll need to look up what I had tried originally.
Re: the partitioning problem, could you create a new issue for discussion? This may also be an existing issue in the repository, which I could link any updates to.
Brilliant, thanks @dackers86 :) ... I'll create a new issue ... should arrive within the next hour or so, although I suspect it will be quite similar to issue 621, which is closed.
Forgive me if there are any similar open issues I've missed, but the partitioning hasn't worked since I first started using this extension back in May last year :(
New issue created #1059 - please let me know if I can provide any further information.
@dackers86 I can also confirm that all values in my partitioned table's column are null for Firestore documents that have valid Firestore Timestamp values.