elementary icon indicating copy to clipboard operation
elementary copied to clipboard

Parse failed rows as part of the package

Open Maayan-s opened this issue 3 years ago • 6 comments

Failed dbt test results return an error message that includes the number of failed rows. Today we load the message as is in the package, and parse the number in the CLI using Python. If we move the parsing to the package, users could run their analysis on these data.

Maayan-s avatar Aug 03 '22 16:08 Maayan-s

Hello! Would it be also possible to return the number of total failed rows. For now, the error messages includes the number of singular different failed rows but not the total occurrences of rows that are failed. The goal is to track over time the percentage of failures through our tests. Thanks!

aqwvinh avatar Aug 09 '22 12:08 aqwvinh

Copying from Slack for documentation:

The process is usually -

  • forking the repo, in this case the changes are in the package so you'll need to fork this repo
  • Changing the code
  • Opening a PR

We can start by first extracting the failed results number from the message using a regex. We should save this failed_results as a new column failed_results in elementary_test_results. First we need to add this column here. This basically a macro that the elementary_test_results model uses to know what the schema of this table should be. Then we need to extract the failed results from the message in a macro called handle_test_results, around here. This macro responsible for storing all the test results in the elementary_test_results table. To extract it we need to run a simple regex that extracts a number from the message, the result should be saved in a new field in the dict here, called 'failed_results' as this is the same name of the column we created.

Maayan-s avatar Aug 25 '22 15:08 Maayan-s

@oravi I checked the result object as you said, and saw the 'failures' field: message='Got 3100 results, configured to fail if != 0', failures=3100 message='Got 1 result, configured to fail if != 0', failures=1

Do you think it's better to use it?

Maayan-s avatar Aug 28 '22 08:08 Maayan-s

Hi @aqwvinh, we think that this might be a quicker win than the regex. What do you say?

Maayan-s avatar Aug 29 '22 12:08 Maayan-s

Hello! I will try to find some time to work on the PR this week but as a quick answer to your last message, it can be quicker indeed ! But what about if the test passes? Does it return failures=0?

aqwvinh avatar Aug 29 '22 12:08 aqwvinh

Good question, it returns the number of failures. If the test has no special config and passes it returns 0. If the test is configured to fail on failures>100 and there are 5 failures, it will return 5 for a passed test. If there is an error in running the test - it returns None.

(I checked on dbt 1.0.0 and above)

Maayan-s avatar Aug 29 '22 12:08 Maayan-s

Released on 0.4.12 of the dbt pakage! @aqwvinh

Maayan-s avatar Sep 23 '22 17:09 Maayan-s