dbt-redshift
dbt-redshift copied to clipboard
[ADAP-381] Enable tagging of query_group in configs
Is this your first time submitting a feature request?
- [X] I have read the expectations for open source contributors
- [X] I have searched the existing issues, and I could not find an existing issue for this feature
- [X] I am requesting a straightforward extension of existing dbt-redshift functionality, rather than a Big Idea better suited to a discussion
Describe the feature
Much akin to dbt-snowflake's implementation of query tagging, I think it'd be great to allow users to set a query_group for their models. I'd imagine the implementation of this would be quite straightforward as the functionality is quite similar to that of Snowflake's, but hoping to get input there.
Describe alternatives you've considered
Not aware that there are any alternatives.
Who will this benefit?
Anybody who uses the Redshift query logs to perform analysis on their queries.
Are you interested in contributing this feature?
Certainly! Feels like it could be a good first issue.
Anything else?
No response
Good idea @trouze !
Agreed that this might be relatively straightforward especially having the query_tag
config in dbt-snowflake as prior art. Will label this as a good_first_issue
.
One way to see the relevant implementation of the feature and its tests in dbt-snowflake is searching in GitHub: https://github.com/dbt-labs/dbt-snowflake/search?q=query_tag
Personally, I prefer to do a git grep
with a local clone of the dbt-snowflake repo for this type of search:
git grep "query_tag" .
A design decision for refinement
The name of the actual configuration is a piece of the design that we'll need to refine:
-
query_group
vs.query_tag
Argument for query_group
:
-
query_group
is the vendor-specific name of the session setting in Redshift -
query_tag
is the vendor-specific name of the session setting in Snowflake
Argument for query_tag
:
- From what I can tell,
query_group
in Redshift andquery_tag
in Redshift basically do the same thing -
query_tag
is a pre-existing config name within dbt (for dbt-snowflake) so the naming and documentation would be more uniform to keep this name the same
Overall, I think either choice would work fine -- there are trade-offs either way.
A final option would be to allow query_tag
and query_group
to be aliases of each other in both dbt-redshift and dbt-snowflake.
Sweet!
I think the biggest question I have at the outset here is over materializations. In dbt-snowflake the set_query_tag()
method is called in each of the adapter-specific materializations. Obviously, dbt-redshift currently relies on the base dbt-core materializations supplied through the global_project. So the question is, does this necessitate the need for adapter specific materializations to be introduced to dbt-redshift, or is there another way to add this functionality without doing that?
To follow up, there is some development that maintainers can take a look at here.
Regarding question above, only other option I foresee would be to adjust logic in the create_table_as()
and create_view_as()
macros, but this would mean any pre or post-hooks would not be captured within the desired query group.
Nice progress @trouze !
So the question is, does this necessitate the need for adapter specific materializations to be introduced to dbt-redshift, or is there another way to add this functionality without doing that?
It seems like there would be two main ways to approach this:
- introduce adapter specific materializations to dbt-redshift
- introduce
query_tag
configuration to dbt-core (and most adapters would just raiseNotImplemented
(or something similar)
Approach #1 seems the least complicated in the short-term. Adapter specific materializations will probably be introduced by https://github.com/dbt-labs/dbt-redshift/issues/204 if they aren't introduced here first.