DataflowTemplates icon indicating copy to clipboard operation
DataflowTemplates copied to clipboard

[Bug]: _metadata_big_query_commit_timestamp is not populated when useStorageWriteApi=true in Spanner to BigQuery template

Open ruslan-yuldashev-sxp opened this issue 8 months ago • 2 comments

Related Template(s)

Spanner_Change_Streams_to_BigQuery

Template Version

Latest version from Google Cloud Console (Flex Template)

What happened?

I'm using the Cloud Spanner change streams to BigQuery template as described in the documentation: https://cloud.google.com/dataflow/docs/guides/templates/provided/cloud-spanner-change-streams-to-bigquery

My pipeline is configured with --useStorageWriteApi=true.

The documentation mentions the following:

_metadata_big_query_commit_timestamp: The commit timestamp when the row is inserted into BigQuery. If useStorageWriteApi is true, this column is not automatically created in the changelog table by the pipeline. In that case, you must manually add this column in the changelog table if needed.

  • I manually added the _metadata_big_query_commit_timestamp column to the changelog table schema (as a TIMESTAMP).
  • The pipeline is running correctly and writing rows to BigQuery.
  • However, the _metadata_big_query_commit_timestamp field is never populated — it remains NULL in all rows.

🔍 Question:

Is this field supposed to be automatically populated by the template or by BigQuery Storage Write API when useStorageWriteApi=true?

If not — what is the recommended way to populate this value?

Thank you!

Relevant log output


ruslan-yuldashev-sxp avatar Apr 25 '25 16:04 ruslan-yuldashev-sxp

@ruslan-yuldashev-sxp , when using --useStorageWriteApi=true , we need to manually add the _metadata_big_query_commit_timestamp column and set CURRENT_TIMESTAMP as its default value if we need the column.

So, if you have already created column _metadata_big_query_commit_timestamp manually, what you can further do is :

ALTER TABLE {PROJECT_ID}.{DATASET_ID}.{CHANGELOG_TABLE} ALTER COLUMN _metadata_big_query_commit_timestamp SET DEFAULT CURRENT_TIMESTAMP()

After this, _metadata_big_query_commit_timestamp should be populated. Can you please confirm the same ?

I'll check if we can improve public documentation.

TanuSharma2511 avatar May 06 '25 07:05 TanuSharma2511

Tanu, please check this. Thanks.

liferoad avatar Jun 22 '25 14:06 liferoad