dbt-project-evaluator
dbt-project-evaluator copied to clipboard
New Rules! for model governance (dbt Core v1.5)
Describe the feature
https://dbt-labs.github.io/dbt-project-evaluator/0.5/rules/
I'd like to propose some New Rules, related to the model governance features that are new in v1.5.
-
Public Models Have Contracts: Every model with
access: public
also hascontract: {enforced: true}
-
Public Models Are Documented: Every model with
access: public
has a top-leveldescription
anddescription
for every column (this rule, but more) - Column Naming Convention: For columns in models with explicitly defined contracts, we can assert that columns' names and data types match our opinionated recommendations.
-
Exposures Public Parents: Exposures depend only on models with
access: public
(similar to this rule) -
At Most Three Versions: There should be, at most, three versions of a given model: one
latest
, oneold
, oneprerelease
-
Group Coverage: Calculate percentage of resources in
groups
. Groupable resources are everything exceptsources
+exposures
. Raise a warning about resources that aren't in groups, with an owner. (This will be a heavier lift at first; we should build tooling to make it easier!) -
Model Access Ratio: Calculate ratios of model
access
: private, protected, public. My absolutely arbitrary guess at an ideal breakdown would be somewhere in the ballpark of 50/40/10%. Everygroup
should have at least oneprivate
model.
Related to deprecation_date
(https://github.com/dbt-labs/dbt-core/issues/7433), coming in v1.6:
- All
old
model versions (<latest_version
) must have adeprecation_date
- All models with a
deprecation_date
earlier than today should be removed
Very open to thoughts & feedback on all of the above — and more ideas, of course :)
Who will this benefit?
Developers & maintainers of large & complex projects, who want to start taking advantage of v1.5 features for model governance
Are you interested in contributing this feature?
I can help with refining/brainstorming, &/or possibly contribute with some of my hacktime :)
Love all of these ideas and am excited to address new best practices for model contracts. Do you think these best practices fall into one of our pre-existing categories (modeling, testing, documentation, structure, performance) or does this call for a new category?
I think some of these fall neatly into modeling (such as Exposures Public Parents) - but for the others, I'm thinking about creating a new category for "governance".
In my mind, most of them would fall on a new Governance category. Mostly because contracts/governance is quite opt-in and it would be great for people to activate/deactivate Governance rules if they want or not.
The one that I am less sure about is "Column Naming Convention". While it only really works when governance is in place and all columns are listed, this feels more about modelling that Governance to me.
loving this list of new no nos:
we're gonna tackle this, likely in this priority order:
Definitely:
- Public Models Have Contracts: for sure!
- Public Models Are Documented: of course! - column level checks will be a slightly new construct here, but doing a model-level check for all columns have descriptions should be doable!
- Exposures Public Parents: deal! organzing all of these into a
Governance
folder will be key here, since those who don't adopt mesh stuff might get some noise on existing exposures
Maybe:
- Group Coverage: We're on board here, but given how little we know about how people will use this, not sure we have a clear opinionated stance on whether you should group everything
- At Most Three Versions: Similar to above -- likely a good check, but until we see folks using versions, may be hard to properly capture all the use cases in a single check
- Column Naming Convention: this one seems hard as hell! column naming is something that we have found to be the most flexible, and while there's value in checking for a standard within a project, our recommendations are definitely not the most common ones. The signal to noise ratio might be a bit low, and this would invite a lot of requests for overrides/customization
- Model Access Ratio: We agree that all the have are gut checks on what the right ratios would be 😅 we may create a similar "governance coverage" model like the test and documentation coverage models we have, and this could be a good candidate!
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
@dave-connors-3 , you were looking at those at some points. Are you OK with closing this issue with the changes you did or do you think there is more work to be done?
How about fct_public_models_without_exposures
?
In our teams, all public models have external access, and writing the exposures are sometimes missed. This new governance is useful to avoid the missing.
IMO, public models without exposures are rare, it should be into dbt_project_evaluator_exceptions.csv as exception or disable the rule. (maybe wrong)
Please tell us how do you think about it ..
Hi @sadahry , thanks for adding this suggestion!
I don't know if your rule would apply to a majority of use cases.
In my mind, public models are about the ability to ref()
those models from different dbt groups and/or projects, which is something different from "public" in the sense that those models are consumed outside of dbt by other people and tools.
Internally we define exposures on models that are not public and have public models without exposures.
What I would recommend instead here is for you to create your own custom model and test on top of the models the package is creating (you can check the existing models and tests for inspiration) and you can then run the project evaluator package with dbt build --select package:dbt_project_evaluator+
(with the +
after) to pick up the default models/tests and your custom ones
Understood. I'll create a custom model. Thanks!
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.