Uploading run results fails if model is skipped (BigQuery)
Describe the bug
When a dbt run skips a BigQuery materialized view model, the Elementary package fails with the error "Value has type STRING which cannot be inserted into column rows_affected, which has type INT64".
Looking at the SQL job in the BigQuery console, I identified the issue: Elementary tries to insert the value '-1' (with quotes, i.e. as a string) into the rows_affected column, which is INT64.
I further identified that the string value '-1' fails the number check {%- if value is number -%} in the insert_rows macro here: https://github.com/elementary-data/dbt-data-reliability/blob/0.16.1/macros/utils/table_operations/insert_rows.sql#L191
Changing the line to {%- if value is number or value == '-1' -%} would fix the issue.
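Because Jinja's `is number` test is False for any string, the skipped model's rows_affected value of "-1" gets quoted instead of inserted as an integer. A more general fix than special-casing '-1' would be to cast any integer-looking string. Here is a minimal sketch of that idea in Python (not the package's actual code; the helper name is hypothetical):

```python
def normalize_rows_affected(value):
    """Coerce adapter_response values like the string "-1" to int.

    BigQuery reports rows_affected as the string "-1" for skipped
    models; treating any integer-looking string as a number avoids
    inserting a quoted value into an INT64 column.
    """
    if isinstance(value, (int, float)):
        return value
    if isinstance(value, str):
        try:
            return int(value)
        except ValueError:
            return None  # not numeric; leave for the caller to decide
    return None
```

This avoids hard-coding the sentinel '-1' and would also handle any other numeric string the adapter might return.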
However, there is an open question of whether and how "skipped" run results should be stored at all.
To Reproduce
Steps to reproduce the behavior:
- Create a materialized view with the on_configuration_change = 'continue' config
- dbt run to create the view
- Make a schema change in the materialized view
- dbt run
Expected behavior
The dbt project and Elementary package should run without errors.
Screenshots
Error message:
on-run-end failed, error:
Value has type STRING which cannot be inserted into column rows_affected, which has type INT64 at
Environment (please complete the following information):
- Elementary CLI (edr) version: n/a
- Elementary dbt package version: 0.16.1
- dbt version you're using: 1.8.7
- Data warehouse: bigquery (1.8.3)
- Infrastructure details: Dev Container based on python:3.11-slim-bullseye
Additional context
Here's the run_results.json from the run. As you can see, rows_affected is set to the string "-1".
"metadata": {
"dbt_schema_version": "https://schemas.getdbt.com/dbt/run-results/v6.json",
"dbt_version": "1.8.7",
"generated_at": "2024-11-13T06:58:45.324884Z",
"invocation_id": "...",
"env": {}
},
"results": [
{
"status": "success",
"timing": [
{
"name": "compile",
"started_at": "2024-11-13T06:58:39.217536Z",
"completed_at": "2024-11-13T06:58:39.245089Z"
},
{
"name": "execute",
"started_at": "2024-11-13T06:58:39.245704Z",
"completed_at": "2024-11-13T06:58:39.630172Z"
}
],
"thread_id": "Thread-1 (worker)",
"execution_time": 0.41369032859802246,
"adapter_response": {
"_message": "skip `project`.`dataset`.`table`",
"code": "skip",
"rows_affected": "-1"
},
"message": "skip `project`.`dataset`.`table`",
"failures": null,
"unique_id": "model.model_name",
"compiled": true,
"compiled_code": "...",
"relation_name": "`project`.`dataset`.`table`"
}
],
"elapsed_time": 8.028469562530518,
"args": {
...
}
}
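Until the macro is fixed, the string value could in principle be normalized when reading run_results.json before upload. A sketch of such a workaround (hypothetical helper, not part of Elementary):

```python
def coerce_rows_affected(run_results: dict) -> dict:
    """Cast string rows_affected values (e.g. "-1" from BigQuery skips) to int in place."""
    for result in run_results.get("results", []):
        resp = result.get("adapter_response", {})
        ra = resp.get("rows_affected")
        if isinstance(ra, str):
            try:
                resp["rows_affected"] = int(ra)
            except ValueError:
                pass  # leave non-numeric strings untouched
    return run_results

# Example with the skipped-model shape from this issue:
data = {"results": [{"adapter_response": {"code": "skip", "rows_affected": "-1"}}]}
coerce_rows_affected(data)
```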
Would you be willing to contribute a fix for this issue?
Possibly yes, but design decisions are needed on how to handle run results for skipped models.