
Bug - onSchemaChange doesn't work as expected in BigQuery Dataform

Open RashedMahjna opened this issue 11 months ago • 10 comments

Steps Taken to Address the Issue:

1) Core Version Update: Upgraded the core version to 3.0.12 in workflow_settings.yaml.

2) Package Installation: Installed necessary dependencies.

3) Initial SQLX Configuration: Created a SQLX file with the following configuration:

config {
    type: "incremental",
    schema: "test",
    onSchemaChange: "EXTEND",
}
select 1 as a, 2 as b, "asd" as c;

4) Column Addition: Added a new column d to the query in the SQLX file:

config {
    type: "incremental",
    schema: "test",
    onSchemaChange: "EXTEND",
}
select 1 as a, 2 as b, "asd" as c, "test" as d;

Expected vs. Actual Outcome:

  • Expected Behavior: The schema should recognize the newly added column d and automatically update the table structure.

  • Actual Behavior: The table schema was not updated accordingly.

  • Additional Issue: After adding uniqueKey and dropping a column, an error occurred: Query error: Name c not found inside S
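For illustration, the expected EXTEND behavior is roughly equivalent to applying DDL like the following before the incremental insert. This is only a sketch of the intended effect, not what Dataform actually runs; `my_table` is a placeholder for the table produced by the SQLX file, and `test` is the schema from the config above.

```sql
-- Hypothetical manual equivalent of extending the schema with column d.
-- `my_table` is a placeholder name; STRING matches the "test" literal in the select.
ALTER TABLE test.my_table
  ADD COLUMN IF NOT EXISTS d STRING;
```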

Root Cause Analysis & Suggested Fixes:

  • The current logic retrieves column names from INFORMATION_SCHEMA.COLUMNS, which does not immediately reflect newly added or removed columns.
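To see which columns BigQuery reports for the table after a run, a query along these lines can be used. It is a minimal sketch: `my_table` is a placeholder for the table produced by the SQLX file, and `test` is the schema from the config above.

```sql
-- List the columns currently reported for the target table.
-- `my_table` is a placeholder name, not taken from the original report.
SELECT column_name, data_type
FROM test.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'my_table'
ORDER BY ordinal_position;
```

If column d is missing here after the second run, the schema was not extended.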

RashedMahjna avatar Feb 03 '25 06:02 RashedMahjna

This feature is not yet ready for use unfortunately. We'll do a release note when this feature is properly available.

GJMcGowan avatar Feb 03 '25 14:02 GJMcGowan

@GJMcGowan I am using dataform 3.0.23 in bigquery and this issue still exists. Is there anything I need to do in this case to make it work?

ghost avatar Jun 13 '25 16:06 ghost

> I am using dataform 3.0.23 in bigquery and this issue still exists. Is there anything I need to do in this case to make it work?

This feature is still not released officially, so you can't use it yet.

kolina avatar Jun 15 '25 10:06 kolina

Hello @kolina , thank you for your response.

For anyone looking in the future: the official BigQuery Dataform release is currently at version 3.0.0. You can find the release notes here: https://cloud.google.com/dataform/docs/release-notes

ghost avatar Jun 18 '25 07:06 ghost

> For anyone looking in the future: the official BigQuery Dataform release is currently at version 3.0.0. You can find the release notes here: https://cloud.google.com/dataform/docs/release-notes

This note was about releasing the v3 version of @dataform/core; you can still use 2.x or 3.x versions depending on what you set in your Dataform project.

Support of managed incremental schema updates in GCP Dataform requires changes to @dataform/core (already released) and changes in the GCP Dataform execution engine (hasn't been officially released yet, WIP).

kolina avatar Jun 19 '25 13:06 kolina

Thank you for clarifying this. How can I know which version I can use in my dataform project? For example, I used to have it as 3.0.7 and changed it to 3.0.23 to test the support of incremental schema and the workflow worked normally but without this feature. It didn't say this release is not supported for example.

ghost avatar Jun 19 '25 14:06 ghost

> Thank you for clarifying this. How can I know which version I can use in my dataform project? For example, I used to have it as 3.0.7 and changed it to 3.0.23 to test the support of incremental schema and the workflow worked normally but without this feature. It didn't say this release is not supported for example.

After we officially launch support in our GCP execution engine, you'll be able to use it with existing @dataform/core versions where its configuration is supported.

I think it's a good call to throw an error for now until it's supported in our API.

kolina avatar Jun 20 '25 12:06 kolina

@Tuseeq1, can you make the Dataform API return an error for now when someone tries to use the configuration for incremental schema updates?

kolina avatar Jun 20 '25 12:06 kolina

I will add the error.

UPD: See #1991

Ceridan avatar Jun 27 '25 09:06 Ceridan

After the discussion in the PR, we agreed to look for a simple solution on the backend side that turns this into an error for all customers, regardless of Dataform Core version.

Ceridan avatar Jul 02 '25 09:07 Ceridan