dbt-redshift icon indicating copy to clipboard operation
dbt-redshift copied to clipboard

[ADAP-381] Enable tagging of query_group in configs

Open trouze opened this issue 1 year ago • 5 comments

Is this your first time submitting a feature request?

  • [X] I have read the expectations for open source contributors
  • [X] I have searched the existing issues, and I could not find an existing issue for this feature
  • [X] I am requesting a straightforward extension of existing dbt-redshift functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Much akin to dbt-snowflake's implementation of query tagging, I think it'd be great to allow users to set a query_group for their models. I'd imagine the implementation of this would be quite straightforward as the functionality is quite similar to that of Snowflake's, but hoping to get input there.

Describe alternatives you've considered

Not aware that there are any alternatives.

Who will this benefit?

Anybody who uses the Redshift query logs to perform analysis on their queries.

Are you interested in contributing this feature?

Certainly! Feels like it could be a good first issue.

Anything else?

No response

trouze avatar Mar 16 '23 20:03 trouze

Good idea @trouze !

Agreed that this might be relatively straightforward especially having the query_tag config in dbt-snowflake as prior art. Will label this as a good_first_issue.

One way to see the relevant implementation of the feature and its tests in dbt-snowflake is searching in GitHub: https://github.com/dbt-labs/dbt-snowflake/search?q=query_tag

Personally, I prefer to do a git grep with a local clone of the dbt-snowflake repo for this type of search:

git grep "query_tag" .

dbeatty10 avatar Mar 17 '23 14:03 dbeatty10

A design decision for refinement

The name of the actual configuration is a piece of the design that we'll need to refine:

  • query_group vs. query_tag

Argument for query_group:

  • query_group is the vendor-specific name of the session setting in Redshift
  • query_tag is the vendor-specific name of the session setting in Snowflake

Argument for query_tag:

  • From what I can tell, query_group in Redshift and query_tag in Redshift basically do the same thing
  • query_tag is a pre-existing config name within dbt (for dbt-snowflake) so the naming and documentation would be more uniform to keep this name the same

Overall, I think either choice would work fine -- there are trade-offs either way.

A final option would be to allow query_tag and query_group to be aliases of each other in both dbt-redshift and dbt-snowflake.

dbeatty10 avatar Mar 17 '23 14:03 dbeatty10

Sweet!

I think the biggest question I have at the outset here is over materializations. In dbt-snowflake the set_query_tag() method is called in each of the adapter-specific materializations. Obviously, dbt-redshift currently relies on the base dbt-core materializations supplied through the global_project. So the question is, does this necessitate the need for adapter specific materializations to be introduced to dbt-redshift, or is there another way to add this functionality without doing that?

trouze avatar Mar 17 '23 18:03 trouze

To follow up, there is some development that maintainers can take a look at here.

Regarding question above, only other option I foresee would be to adjust logic in the create_table_as() and create_view_as() macros, but this would mean any pre or post-hooks would not be captured within the desired query group.

trouze avatar Mar 20 '23 15:03 trouze

Nice progress @trouze !

So the question is, does this necessitate the need for adapter specific materializations to be introduced to dbt-redshift, or is there another way to add this functionality without doing that?

It seems like there would be two main ways to approach this:

  1. introduce adapter specific materializations to dbt-redshift
  2. introduce query_tag configuration to dbt-core (and most adapters would just raise NotImplemented (or something similar)

Approach #1 seems the least complicated in the short-term. Adapter specific materializations will probably be introduced by https://github.com/dbt-labs/dbt-redshift/issues/204 if they aren't introduced here first.

dbeatty10 avatar Mar 20 '23 20:03 dbeatty10