dbt-core icon indicating copy to clipboard operation
dbt-core copied to clipboard

[CT-846] Detailed `timing` info not collected for models with `error` status

Open jtcohen6 opened this issue 2 years ago • 1 comments

The timing array in run_results.json makes it possible to see the exact timestamps when a model started/finished compiling and executing. The array appears to be entirely empty if the resource encounters a runtime error.

What users see in run_results.json, for an error model vs. a success model:

"results": [
    {
      "status": "error",
      "timing": [],
      "thread_id": "Thread-1",
      "execution_time": 0.1041419506072998,
      "adapter_response": {},
      "message": "Database Error in model failing_model (models/failing_model.sql)\n  column \"abcdef\" does not exist\n  LINE 4:     select abcdef\n                     ^\n  compiled SQL at target/run/testy/models/failing_model.sql",
      "failures": null,
      "unique_id": "model.testy.failing_model"
    },
    {
      "status": "success",
      "timing": [
        {
          "name": "compile",
          "started_at": "2022-07-15T10:01:00.404848Z",
          "completed_at": "2022-07-15T10:01:00.410078Z"
        },
        {
          "name": "execute",
          "started_at": "2022-07-15T10:01:00.436724Z",
          "completed_at": "2022-07-15T10:01:00.530546Z"
        }
      ],
      "thread_id": "Thread-2",
      "execution_time": 0.12770366668701172,
      "adapter_response": {
        "_message": "CREATE VIEW",
        "code": "CREATE VIEW",
        "rows_affected": -1
      },
      "message": "CREATE VIEW",
      "failures": null,
      "unique_id": "model.testy.success_model"
    }
  ],

The relevant code is here, in the compile_and_execute method:

https://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/task/base.py#L336-L340

The collect_timing_info context manager updates the timing_info attribute on self. However, it never appends timing_info above if self.run encounters an error (e.g. a database error):

ipdb> result = self.run(ctx.node, manifest)
*** dbt.exceptions.DatabaseException: Database Error
  column "aposhfaosdfj" does not exist
  LINE 4:     select aposhfaosdfj
                     ^

The resolution to this bug would find a way to return/append/update the timing_info array even when the model encounters an error during its build.

jtcohen6 avatar Jul 15 '22 10:07 jtcohen6

Detailed timing information (including timestamps) would be available from the structured logging interface, which is where we're investing more of our effort around dbt Core metadata going forward

jtcohen6 avatar Jul 26 '22 17:07 jtcohen6

Related: https://github.com/dbt-labs/dbt-snowflake/issues/82#issuecomment-1315392698

Even though the response object is available from failing queries, because of the way we try materialization macros and bubble up any exceptions, we don't plumb the adapter_response back up to the run result object.

jtcohen6 avatar Nov 15 '22 17:11 jtcohen6