🐛 [firestore-export-bigquery] Error initializing BigQuery resources when running fs-bq-import-collection

Open gjanvier opened this issue 5 months ago • 3 comments

  • Extension name: firestore-bigquery-export
  • Extension version: 0.2.5
  • Configuration values (redact info where appropriate):
    • VIEW_TYPE = "materialized_incremental"
    • CLUSTERING = "timestamp"

However, the issue is actually in the import script from that repo, bundled in the npm package @firebaseextensions/[email protected].

[REQUIRED] Step 3: Describe the problem

Steps to reproduce:

Deploy the firestore-bigquery-export extension using the recently introduced materialized_incremental view type. Bonus: also specify "timestamp" as the clustering field.

Then run the backfill with this command:

npx @firebaseextensions/fs-bq-import-collection \
  --non-interactive \
  --project=xxx \
  --big-query-project=xxx \
  --source-collection-path=my_collection \
  --query-collection-group=false \
  --dataset=firestore_export \
  --dataset-location=europe-west1 \
  --table-name-prefix=my_collection

Expected result

I expect the command not to update my existing BigQuery tables / views, but simply to INSERT new rows for any document data found in my Firestore collection.

Actual result

The script drops my clustering for no reason. It would also have altered my partitioning if I had any.

Finally, the script crashes with this error:

{"severity":"INFO","message":"BigQuery dataset already exists: firestore_export"}
{"severity":"INFO","message":"Clustering removed on my_collection_raw_changelog"}
{"severity":"WARNING","message":"Did not add partitioning to schema: Partitioning not enabled"}
{"severity":"INFO","message":"Updated Metadata on my_collection_raw_changelog, {\"config\":{\"tableId\":\"my_collection\",\"datasetId\":\"firestore_export\",\"datasetLocation\":\"europe-west1\",\"wildcardIds\":false,\"useNewSnapshotQuerySyntax\":false,\"bqProjectId\":\"xxx\",\"firestoreInstanceId\":\"(default)\"},\"documentIdColExists\":{\"name\":\"document_id\",\"type\":\"STRING\",\"mode\":\"NULLABLE\",\"description\":\"The document id as defined in the firestore database.\"},\"oldDataColExists\":{\"name\":\"old_data\",\"type\":\"STRING\",\"mode\":\"NULLABLE\",\"description\":\"The full JSON representation of the document state before the indicated operation is applied. This field will be null for CREATE operations.\"}})"}
{"severity":"INFO","message":"View with id my_collection_raw_latest already exists in dataset firestore_export."}
Error initializing BigQuery resources:  Error initializing latest view: Cannot set a view definition for xxx:firestore_export.my_collection_raw_latest because it is not of type View.
Error importing Collection to BigQuery: Error: Error initializing latest view: Cannot set a view definition for xxx:firestore_export.my_collection_raw_latest because it is not of type View.
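
The last two log lines suggest that the initialization step treats my_collection_raw_latest as a regular view, while with VIEW_TYPE = "materialized_incremental" it is not of type View. As a minimal sketch (not the extension's actual code), an initializer could inspect the `type` field of the BigQuery table metadata before attempting to set a view definition. The names `TableMetadata`, `canSetViewDefinition` and `planLatestViewInit` are hypothetical and only serve the illustration.

```typescript
// Hypothetical guard, NOT the extension's real implementation.
// BigQuery's Table resource reports its kind in a `type` field,
// for example "TABLE", "VIEW" or "MATERIALIZED_VIEW".
type TableMetadata = { type?: string };

// A view definition can only be set on an object of type VIEW.
function canSetViewDefinition(metadata: TableMetadata): boolean {
  return metadata.type === "VIEW";
}

// Decide what to do with the "latest" object before touching it.
// `undefined` stands for "no object with that id exists yet".
function planLatestViewInit(
  metadata: TableMetadata | undefined
): "create" | "update" | "skip" {
  if (metadata === undefined) {
    return "create";
  }
  // Leave materialized views and plain tables untouched.
  return canSetViewDefinition(metadata) ? "update" : "skip";
}
```

With such a check, an existing materialized view would be skipped instead of triggering the "Cannot set a view definition" error.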

gjanvier, Jul 11 '25 12:07

I created https://github.com/firebase/extensions/pull/2469 to solve this issue by adding an optional --skip-init argument to the backfill command of the @firebaseextensions/fs-bq-import-collection CLI.

To me, --skip-init should be the default behavior, and users could add --with-init to explicitly run the BigQuery initialization before the backfill. However, I implemented it this way to avoid any breaking change.
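
To illustrate the trade-off between the two defaults, here is a rough sketch of how such flags could gate the initialization step. It is not the code from the PR: `shouldInitialize` and its `defaultInit` parameter are hypothetical, and --with-init is only the alternative suggested in this comment, not an existing option.

```typescript
// Hypothetical flag handling, NOT the code from the PR.
// --skip-init is the option the PR introduces; --with-init is only
// the alternative floated in this comment.
function shouldInitialize(argv: string[], defaultInit: boolean = true): boolean {
  // An explicit opt-out always wins.
  if (argv.indexOf("--skip-init") !== -1) {
    return false;
  }
  // An explicit opt-in overrides the default.
  if (argv.indexOf("--with-init") !== -1) {
    return true;
  }
  // Otherwise fall back to the default: true keeps today's behavior,
  // false would be the breaking change discussed above.
  return defaultInit;
}
```

Keeping `defaultInit` at true preserves the current behavior for existing users, which is why the opt-out form avoids a breaking change.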

gjanvier, Jul 11 '25 12:07

Hi there, thanks for raising this! Appreciate that you've opened a PR as well to provide a workaround.

I will raise this with the team to discuss the best path forward and provide you with updates ASAP.

cabljac, Jul 15 '25 16:07

Thanks for raising / looking into this. I am also having this issue - after just deploying many extensions via manifest, the import script is removing clustering on all tables.

mothman11, Jul 29 '25 07:07