
🐛 [firestore-bigquery-export] Wrong reference path passed to firestore doc

Open YounesAmalou opened this issue 7 months ago • 1 comments

Describe your configuration

  • Extension name: firestore-bigquery-export
  • Extension version: 0.1.24
  • Configuration values (redact info where appropriate):
    • Firebase project ID: mtp-dev-001
    • BigQuery project ID: mtp-dev-001
    • Firestore collection path: Users/{userid}/Entries
    • Use Collection Group query: Yes
    • BigQuery dataset ID: firestore_export
    • BigQuery table prefix: userentries
    • Documents per import batch: 300
    • BigQuery dataset location: us
    • Use multithreaded import: Yes
    • Use optimized snapshot query: Yes
    • Transform function URL: (None)
    • Use local Firestore emulator: No
    • Failed import output location: (None)

Describe the problem

Steps to reproduce:

I had pre-existing collections in Firestore. While trying to import them using GCP Shell, I hit the error shown below. I tried different settings, but the same error keeps occurring. It happens for every document, and no output is produced.

Expected result

The documents should be imported into the destination BigQuery table.

Actual result
{"severity":"INFO","message":"BigQuery dataset already exists: firestore_export"}
{"severity":"WARNING","message":"Did not add partitioning to schema: Partitioning not enabled"}
{"severity":"INFO","message":"Clustering removed on userentries_raw_changelog"}
{"severity":"INFO","message":"Created BigQuery table: userentries_raw_changelog"}
{"severity":"WARNING","message":"Error caught creating table Provided Schema does not match Table mtp-dev-001:firestore_export.userentries_raw_latest. Field path_params is missing in new schema"}
Wait a few seconds for the dataset to initialize...
Importing data from Cloud Firestore Collection (via a Collection Group query): Users/{userid}/Entries, to BigQuery Dataset: firestore_export, Table: userentries_raw_changelog
(node:4445) AutopaginateTrueWarning: Autopaginate will always be set to false in stream paging methods. See more info at https://github.com/googleapis/gax-nodejs/blob/main/client-libraries.md#auto-pagination for more information on how to configure paging calls
(Use `node --trace-warnings ...` to show where the warning was created)
An error has occurred on the following documents, please re-run or insert the following query documents manually... {"endAt":{"before":true,"values":[{"referenceValue":"projects/mtp-dev-001/databases/(default)/documents/Users/0HP7GEgbOXU7hk29x6GJhWJt73z2/Entries/Zqi8SsjVmHLOVQnNxiON","valueType":"referenceValue"}]}}
Error: Value for argument "documentPath" must point to a document, but was "projects/mtp-dev-001/databases/(default)/documents/Users/0HP7GEgbOXU7hk29x6GJhWJt73z2/Entries/Zqi8SsjVmHLOVQnNxiON". Your path does not contain an even number of components.
    at Firestore.doc (/home/mygcpuser/node_modules/@google-cloud/firestore/build/src/index.js:702:19)
    at AsyncFunction.processDocuments (/home/mygcpuser/node_modules/@firebaseextensions/fs-bq-import-collection/lib/worker.js:41:54)
    at MessagePort.<anonymous> (/home/mygcpuser/node_modules/workerpool/src/worker.js:157:27)
    at [nodejs.internal.kHybridDispatch] (node:internal/event_target:827:20)
    at MessagePort.<anonymous> (node:internal/per_context/messageport:23:28)
An error has occurred on the following documents, please re-run or insert the following query documents manually... {"startAt":{"before":true,"values":[{"referenceValue":"projects/mtp-dev-001/databases/(default)/documents/Users/0HP7GEgbOXU7hk29x6GJhWJt73z2/Entries/Zqi8SsjVmHLOVQnNxiON","valueType":"referenceValue"}]},"endAt":{"before":true,"values":[{"referenceValue":"projects/mtp-dev-001/databases/(default)/documents/Users/1600XwSJ4eeVkBigm1nS/Entries/nfUTcslwh0nFQuN6RbmQ","valueType":"referenceValue"}]}}
Error: Value for argument "documentPath" must point to a document, but was "projects/mtp-dev-001/databases/(default)/documents/Users/0HP7GEgbOXU7hk29x6GJhWJt73z2/Entries/Zqi8SsjVmHLOVQnNxiON". Your path does not contain an even number of components.
    at Firestore.doc (/home/mygcpuser/node_modules/@google-cloud/firestore/build/src/index.js:702:19)
    at AsyncFunction.processDocuments (/home/mygcpuser/node_modules/@firebaseextensions/fs-bq-import-collection/lib/worker.js:37:52)
    at MessagePort.<anonymous> (/home/mygcpuser/node_modules/workerpool/src/worker.js:157:27)
    at [nodejs.internal.kHybridDispatch] (node:internal/event_target:827:20)
    at MessagePort.<anonymous> (node:internal/per_context/messageport:23:28)
An error has occurred on the following documents, please re-run or insert the following query documents manually... {"startAt":{"before":true,"values":[{"referenceValue":"projects/mtp-dev-001/databases/(default)/documents/Users/1600XwSJ4eeVkBigm1nS/Entries/nfUTcslwh0nFQuN6RbmQ","valueType":"referenceValue"}]},"endAt":{"before":true,"values":[{"referenceValue":"projects/mtp-dev-001/databases/(default)/documents/Users/1u9YyfUpVKudATpd8ql1/Entries/4WiwAWsaC14lRYnoXAMC","valueType":"referenceValue"}]}}
Error: Value for argument "documentPath" must point to a document, but was "projects/mtp-dev-001/databases/(default)/documents/Users/1600XwSJ4eeVkBigm1nS/Entries/nfUTcslwh0nFQuN6RbmQ". Your path does not contain an even number of components.
    at Firestore.doc (/home/mygcpuser/node_modules/@google-cloud/firestore/build/src/index.js:702:19)
    at AsyncFunction.processDocuments (/home/mygcpuser/node_modules/@firebaseextensions/fs-bq-import-collection/lib/worker.js:37:52)
    at MessagePort.<anonymous> (/home/mygcpuser/node_modules/workerpool/src/worker.js:157:27)
    at [nodejs.internal.kHybridDispatch] (node:internal/event_target:827:20)
    at MessagePort.<anonymous> (node:internal/per_context/messageport:23:28)
An error has occurred on the following documents, please re-run or insert the following query documents manually... {"startAt":{"before":true,"values":[{"referenceValue":"projects/mtp-dev-001/databases/(default)/documents/Users/1u9YyfUpVKudATpd8ql1/Entries/4WiwAWsaC14lRYnoXAMC","valueType":"referenceValue"}]},"endAt":{"before":true,"values":[{"referenceValue":"projects/mtp-dev-001/databases/(default)/documents/Users/1u9YyfUpVKudATpd8ql1/Entries/VsCbbUyQ2Reh7RLYenWG","valueType":"referenceValue"}]}}
Error: Value for argument "documentPath" must point to a document, but was "projects/mtp-dev-001/databases/(default)/documents/Users/1u9YyfUpVKudATpd8ql1/Entries/4WiwAWsaC14lRYnoXAMC". Your path does not contain an even number of components.
    at Firestore.doc (/home/mygcpuser/node_modules/@google-cloud/firestore/build/src/index.js:702:19)
    at AsyncFunction.processDocuments (/home/mygcpuser/node_modules/@firebaseextensions/fs-bq-import-collection/lib/worker.js:37:52)
    at MessagePort.<anonymous> (/home/mygcpuser/node_modules/workerpool/src/worker.js:157:27)
    at [nodejs.internal.kHybridDispatch] (node:internal/event_target:827:20)
    at MessagePort.<anonymous> (node:internal/per_context/messageport:23:28)

YounesAmalou avatar May 23 '25 16:05 YounesAmalou

Hi, I'm reviewing your PR. Strangely, I wasn't able to reproduce this issue. I will keep trying to reproduce it and keep you updated.

cabljac avatar Jun 10 '25 15:06 cabljac

Hello! Any news on this issue? I seem to be experiencing the same thing. I have the same structure as @YounesAmalou, and the only input that differs on my side is the optimized snapshot query setting. I'd appreciate any update, thanks!

angelabhouse avatar Jul 02 '25 15:07 angelabhouse

I've noticed that the referenceValue in the serializableQuery holds a full path reference that looks similar to

projects/{project_id}/databases/{database_id}/documents/{document_path}

I couldn't find where this value is initialized, nor any documentation for it besides this reference (referenceValue), but for the doc(path) method, the path parameter looks like the {document_path} portion of the full path.

For testing, I ran the script directly inside Google Cloud Shell, following the steps in the documentation. The first run produced the error, so I patched the script's dist build code directly, and it worked! Later I needed the script for another collection (this time not a subcollection), and it showed the same error until I patched it so I could keep going with my task.
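To illustrate the mismatch between referenceValue and doc(path), here is a minimal TypeScript sketch. It is not the patch from the PR and not the extension's own code; the helper name toRelativeDocumentPath is hypothetical. It assumes the value has the shape projects/{project_id}/databases/{database_id}/documents/{document_path}. Counted as plain slash-separated segments, the full name from the logs has nine components, which would explain the "even number of components" error, while the {document_path} portion alone has four.

```typescript
// Hypothetical helper for illustration only; not taken from the extension or the PR.
// Matches the "projects/{project_id}/databases/{database_id}/documents/" prefix.
const RESOURCE_PREFIX = /^projects\/[^/]+\/databases\/[^/]+\/documents\//;

function toRelativeDocumentPath(referenceValue: string): string {
  // Drop the resource-name prefix; a path that is already relative is left unchanged.
  const relative = referenceValue.replace(RESOURCE_PREFIX, "");
  const segments = relative.split("/").filter((segment) => segment.length > 0);

  // A document path alternates collection/document, so it has an even segment count.
  if (segments.length === 0 || segments.length % 2 !== 0) {
    throw new Error(`Not a document path: ${referenceValue}`);
  }
  return segments.join("/");
}

// Example with a value from the logs above:
const fullName =
  "projects/mtp-dev-001/databases/(default)/documents/Users/0HP7GEgbOXU7hk29x6GJhWJt73z2/Entries/Zqi8SsjVmHLOVQnNxiON";
console.log(toRelativeDocumentPath(fullName));
// prints "Users/0HP7GEgbOXU7hk29x6GJhWJt73z2/Entries/Zqi8SsjVmHLOVQnNxiON"
```

The resulting relative path is the form that doc(path) appears to expect.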

YounesAmalou avatar Jul 06 '25 10:07 YounesAmalou

Thanks for the detailed explanation on the PR, @YounesAmalou — I tried the patch and it finally worked!

However, I’m now running into a different issue: not all user documents are being imported into BigQuery. Have you experienced something similar? I’ve checked the logs but haven’t been able to find any indication of a limit on the number of documents or any related restriction. Any insight would be greatly appreciated!

angelabhouse avatar Jul 09 '25 13:07 angelabhouse

@angelabhouse sorry to hear that, I haven't come across this issue before.

However, to be sure, could you explain how you found that the documents don't match?

What I did on my side is:

  1. Ran a query in BigQuery to COUNT(1) the rows in the latest table.
  2. In the Query Builder of the Firebase interface, retrieved the count of documents under the same path as the query.
  3. Compared the two counts and found that they match.
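If the counts differ, the comparison can be taken one step further by listing which documents are missing. The TypeScript sketch below is only an illustration: compareImportedDocuments is a hypothetical helper, not part of the extension, and it assumes you have already exported the document paths from both Firestore and BigQuery and normalized them to the same form.

```typescript
// Hypothetical helper for illustration only; not part of the extension or its import script.
interface ImportComparison {
  firestoreCount: number; // distinct documents found in Firestore
  bigQueryCount: number; // distinct documents found in BigQuery
  missingInBigQuery: string[]; // paths present in Firestore but absent from BigQuery
}

function compareImportedDocuments(
  firestorePaths: string[],
  bigQueryPaths: string[]
): ImportComparison {
  // De-duplicate both sides so repeated rows do not distort the counts.
  const source = new Set<string>(firestorePaths);
  const imported = new Set<string>(bigQueryPaths);

  const missingInBigQuery = Array.from(source)
    .filter((path) => !imported.has(path))
    .sort();

  return {
    firestoreCount: source.size,
    bigQueryCount: imported.size,
    missingInBigQuery,
  };
}

// Example with made-up paths:
const comparison = compareImportedDocuments(
  ["Users/a/Entries/1", "Users/a/Entries/2", "Users/b/Entries/3"],
  ["Users/a/Entries/1", "Users/b/Entries/3"]
);
console.log(comparison.missingInBigQuery); // prints [ 'Users/a/Entries/2' ]
```

Knowing which paths are missing may help tell whether the gap follows a pattern (for example, a single user's subcollection) or looks random.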

Also, you mentioned that you used a different setting for the optimized snapshot query in the script params. Did you try the other option?

Otherwise, I would suggest creating an issue.

Meanwhile, if you want to keep debugging to find the root cause, try to find the loop in the script and check its length, to figure out whether the problem is caused by the query or by something in the environment.

YounesAmalou avatar Jul 09 '25 23:07 YounesAmalou

I have the same issue. The script fails to import subcollections.

Hinten avatar Jul 30 '25 20:07 Hinten

Just bumping this because I ran into this issue, and verified that the fix proposed in https://github.com/firebase/extensions/pull/2437 solves the problem.

jketcham avatar Aug 14 '25 18:08 jketcham

Hi all, I've flagged this for more investigation, and will provide updates when available!

cabljac avatar Aug 26 '25 16:08 cabljac

I believe I may have accidentally released a fix for this, forgetting that this PR was open. I will review to confirm. If so, I'll make sure that @YounesAmalou is properly credited for their contribution.

cabljac avatar Aug 26 '25 16:08 cabljac

To confirm: version 0.1.26 of @firebaseextensions/fs-bq-import-collection no longer has this issue.

jketcham avatar Sep 19 '25 22:09 jketcham