
WIP (open for comments) - Add metrics for number of lines and SQL complexity for models

Open b-per opened this issue 3 months ago • 0 comments

This is a:

  • [ ] bug fix PR with no breaking changes
  • [x] new functionality

Link to Issue

Closes #426

Description & motivation

This PR (work in progress) adds two metrics to stg_nodes:

  • the number of lines
  • the complexity of the SQL code
    • the complexity is computed with regexes that search the SQL for "tokens", each of which carries a given cost
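
As a rough illustration of the token-cost idea, here is a minimal Python sketch. It is not the PR's implementation: the token patterns, their costs, and the function names `count_lines` and `sql_complexity` are all hypothetical, and counting only non-empty lines is an assumption.

```python
import re

# Hypothetical token patterns and costs, chosen only for illustration.
# The PR does not list the actual tokens or weights.
TOKEN_COSTS = {
    r"\bjoin\b": 2,
    r"\bover\s*\(": 3,   # window functions
    r"\bcase\b": 1,
    r"\bunion\b": 2,
}


def count_lines(sql: str) -> int:
    """Count non-empty lines (assumption: blank lines are ignored)."""
    return sum(1 for line in sql.splitlines() if line.strip())


def sql_complexity(sql: str, token_costs: dict = TOKEN_COSTS) -> int:
    """Sum cost * occurrences for every token pattern found in the SQL."""
    total = 0
    for pattern, cost in token_costs.items():
        matches = re.findall(pattern, sql, flags=re.IGNORECASE)
        total += cost * len(matches)
    return total


example_sql = """
select
    o.id,
    case when o.amount > 100 then 'big' else 'small' end as size,
    row_number() over (partition by o.customer_id order by o.created_at) as rn
from orders o
left join customers c on c.id = o.customer_id
"""

print(count_lines(example_sql))     # 6
print(sql_complexity(example_sql))  # 6 (one join, one window function, one case)
```

A regex approach like this is cheap and portable, but it will also match tokens inside comments or string literals, which is one reason to treat the score as a heuristic.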

I wanted to add those to help teams:

  • find out which models are too complex, so they can be flagged for refactoring
  • define custom rules on how long a model can be
  • combine number of inputs, number of outputs, number of lines and SQL complexity to create an "overall metric" on how important a model is in a given project
    • then, people could set extra custom DPE tests on those models as well
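
One possible way to combine those four numbers into an "overall metric" is a weighted sum. The sketch below is only an assumption about what such a metric could look like: the `ModelStats` fields, the weights, and the function names are hypothetical and are not part of this PR.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    """Per-model numbers; field names are hypothetical, not columns from the PR."""
    name: str
    num_inputs: int
    num_outputs: int
    num_lines: int
    sql_complexity: int


# Hypothetical weights; a real project would tune these to its own needs.
WEIGHTS = {
    "num_inputs": 1.0,
    "num_outputs": 2.0,
    "num_lines": 0.05,
    "sql_complexity": 0.5,
}


def importance_score(model: ModelStats, weights: dict = WEIGHTS) -> float:
    """Weighted sum of the four metrics."""
    return sum(weight * getattr(model, field) for field, weight in weights.items())


def rank_models(models: list) -> list:
    """Return models sorted from most to least important."""
    return sorted(models, key=importance_score, reverse=True)


models = [
    ModelStats("stg_orders", num_inputs=1, num_outputs=4, num_lines=40, sql_complexity=2),
    ModelStats("fct_revenue", num_inputs=5, num_outputs=2, num_lines=320, sql_complexity=30),
]

for m in rank_models(models):
    print(m.name, importance_score(m))
```

The top of the ranking would then be the natural place to apply stricter custom tests.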

To Do

If we agree that this is something we want to add, we will need to update the docs:

  • possibly mentioning the new columns (or adding new tests on those columns, e.g. models should have fewer than 300 lines)
  • mentioning the new variables
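
To make the "fewer than 300 lines" idea concrete, a check of that kind could look roughly like the Python sketch below. The function name and the input shape (a mapping from model name to line count) are hypothetical; an actual DPE test would more likely be a dbt test on the new column.

```python
MAX_LINES = 300  # threshold taken from the example in the to-do item


def models_over_limit(line_counts: dict, max_lines: int = MAX_LINES) -> list:
    """Return the names of models that do NOT have fewer than max_lines lines."""
    return sorted(name for name, n in line_counts.items() if n >= max_lines)


print(models_over_limit({"stg_orders": 40, "fct_revenue": 320}))  # ['fct_revenue']
```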

Checklist

  • [ ] I have verified that these changes work locally on the following warehouses (Note: it's okay if you do not have access to all warehouses, this helps us understand what has been covered)
    • [ ] BigQuery
    • [ ] Postgres
    • [ ] Redshift
    • [ ] Snowflake
    • [ ] Databricks
    • [x] DuckDB
    • [ ] Trino/Starburst
  • [ ] I have updated the README.md (if applicable)
  • [ ] I have added tests & descriptions to my models (and macros if applicable)

b-per, Apr 03 '24 09:04