dbt-sugar
How to assess test coverage quality
When we presented the tool at the dbt office hours, Jillian asked a really good question: "How do you measure and derive test coverage?"
Right now we say: "if a column has a test (any test), then that column is test covered." So in a model with 10 columns, if 5 have a test, your coverage is 50%.
Admittedly, that's a bit crude. I think we should be able to have a more granular definition of test coverage, and maybe one that people can define themselves.
Curious what the community has in mind.
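For reference, the current rule described above can be sketched in a few lines of Python. This is only an illustration, not dbt-sugar's actual code: the function name `column_test_coverage` and the input shape (a dict mapping column names to the list of test names on that column) are assumptions made for the example.

```python
from typing import Dict, List


def column_test_coverage(column_tests: Dict[str, List[str]]) -> float:
    """Return the share of columns that have at least one test (any test)."""
    if not column_tests:
        # No columns at all: report zero coverage rather than dividing by zero.
        return 0.0
    covered = sum(1 for tests in column_tests.values() if tests)
    return covered / len(column_tests)


# Hypothetical model with 10 columns, 5 of which carry a test.
model_columns = {
    f"col_{i}": (["not_null"] if i < 5 else []) for i in range(10)
}

print(f"{column_test_coverage(model_columns):.0%}")  # prints "50%"
```

Note that under this rule a column with a single `not_null` test counts exactly as much as one with several tests, which is the crudeness mentioned above.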
Defining custom coverage definitions is a very interesting idea. Some thoughts:
- It could be useful to restrict the coverage (or apply different coverage configurations) to different models via dbt selectors. For example, one could be stricter about tests in mart models than in staging ones.
- One could define patterns for column names that must be covered, such as: columns with the suffix `_id` must have a `not_null` test, or columns with the prefix `cat_` must have an `accepted_values` test.
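To make the second suggestion concrete, here is a minimal sketch of pattern-based rules. Everything in it is hypothetical: the `REQUIRED_TESTS` mapping, the function name `missing_required_tests`, and the assumption that the tests on each column have already been reduced to plain test names (in a real dbt schema file, a test such as `accepted_values` is a mapping with arguments, not a bare string).

```python
import re
from typing import Dict, List

# Hypothetical rules: regex on the column name -> tests that must be present.
REQUIRED_TESTS: Dict[str, List[str]] = {
    r"_id$": ["not_null"],          # columns with the suffix "_id"
    r"^cat_": ["accepted_values"],  # columns with the prefix "cat_"
}


def missing_required_tests(
    column_tests: Dict[str, List[str]],
    rules: Dict[str, List[str]] = REQUIRED_TESTS,
) -> Dict[str, List[str]]:
    """Return, per column, the required tests that are not defined on it."""
    missing: Dict[str, List[str]] = {}
    for column, tests in column_tests.items():
        for pattern, required in rules.items():
            if re.search(pattern, column):
                absent = [name for name in required if name not in tests]
                if absent:
                    missing.setdefault(column, []).extend(absent)
    return missing


# Hypothetical columns and the test names currently attached to them.
columns = {
    "customer_id": ["not_null", "unique"],
    "order_id": [],
    "cat_status": ["not_null"],
    "amount": [],
}

print(missing_required_tests(columns))
```

In this example `order_id` is flagged for lacking `not_null` and `cat_status` for lacking `accepted_values`, while `amount` matches no rule and is ignored. The first suggestion (different strictness per set of models) could be layered on top by keeping one such rule set per group of models.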
Hey @PabloPardoGarcia, thanks a lot for chiming in. I think that makes a lot of sense. There's some thinking needed on how to orchestrate this, like where we store those definitions.
I'm thinking this should go into the `sugar_config.yml` (parsed here and documented here) as a set of arguments and rules.
I'll have to think about what it could look like. If you have an idea in mind, I'm all ears!
The second point you make is something we had already started thinking about with @virvirlopez for tags also. I think this can be a really good start!