dbt-snowflake icon indicating copy to clipboard operation
dbt-snowflake copied to clipboard

[ADAP-1024] [Feature] Save tags to Snowflake using object tagging feature

Open kanomaxb opened this issue 2 years ago • 1 comments

Is this your first time submitting a feature request?

  • [x] I have read the expectations for open source contributors
  • [X] I have searched the existing issues, and I could not find an existing issue for this feature
  • [X] I am requesting a straightforward extension of existing dbt-snowflake functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Besides dbt tags there should be possibility to define 'snowflake-tags' for tables and columns (and possibly schemas), which would be persisted to Snowflake as tags (https://docs.snowflake.com/en/user-guide/object-tagging).

Describe alternatives you've considered

We are aware of the dbt-snowflake-utils package by Montreal Analytics, which now has a macro that creates Snowflake tags from model/column meta.

It is not usable for tag-based masking though, because there is a time gap between materializing a table and creating a tag (currently it creates tags in on-run-end, so when all models are ready), which would create a serious vulnerability in PII protection. Therefore tags need to be a part of the CTAS (same as masking policies problem described in dbt-labs/dbt-adapters#85

Who will this benefit?

Tags are useful for monitoring, but also they enable a powerful feature called tag-based masking, which is getting more and more popular.

Are you interested in contributing this feature?

No response

Anything else?

This feature has already been requested (#104), but closed with suggestion, that the dbt-snowflake-utils macro is a sufficient alternative. It may be for some cases, but we take our PII protection too seriously to enable a time gap between table materialization and tag creation.

kanomaxb avatar Nov 09 '23 07:11 kanomaxb

This is also important for us, for exactly the same reasons as yours. We currently solve this using a custom materialization, but that comes with its own downsides of course.

Implementation is pretty straightforward. For the definition, I like the names dbt-snowflake-utils, though I'd put them at the top level:

# schema.yml

models:
  - name: ACCOUNT
    +schema: FINANCE
    database_tags:
      accounting_row_string: a

    columns:
      - name: ACCOUNT_NAME
        database_tags:
          accounting_col_string: b

michael-the1 avatar Feb 08 '24 13:02 michael-the1