dbt-snowflake-monitoring
dbt-snowflake-monitoring copied to clipboard
Model the Snowflake access history
We need to model the snowflake.account_usage.access_history, since it contains a bunch of nested JSON columns that aren't easy to query. We need to be able to answer questions like:
- most expensive queries against a particular table with poor pruning (help inform clustering keys)
- most frequently executed queries against a table (help inform clustering keys)
- tables that haven't been accessed in last X days
Bunch of good work in here for reference: https://github.com/jaysobel/dbt-snowflake-queries. Need to connect with Jay, he had found some gotchas with these models.
a really cool extension of this would be to use an AST parser or SQL fluff to include columns like:
- array of columns used in join condition
- array of columns used in where clause
This would be useful for providing clustering key recommendations