dbt-core
dbt-core copied to clipboard
[CT-985] [Feature] on-run-end hook and results variable support for source freshness command
Is this your first time submitting a feature request?
- [X] I have read the expectations for open source contributors
- [X] I have searched the existing issues, and I could not find an existing issue for this feature
- [X] I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion
Describe the feature
- Currently
dbt source freshness
does not support running an on-run-end hook. - Other dbt commands like
dbt run
,dbt test
,dbt snapshot
and more already support running an on-run-end hook and getting results jinja variable as their context. - Using an on-run-end hook and the results variable in the
dbt source freshness
command, we can upload the source freshness results to the data warehouse. This could be used for displaying source freshness results on a dashboard, monitoring, and alerting.
Describe alternatives you've considered
- The
dbt source freshness
command already writes the freshness results to a file called sources.json. - A possible alternative is to process this json file to get the source freshness results.
- This solution requires implementing an external solution that will process this json and automatically upload the results to the data warehouse after every execution of the
dbt source freshness
command. - This alternative is harder to implement and requires implementing code outside of the dbt project.
- Additionally, this solution requires orchestrating this new tool and giving it access to the same project so it can extract the source freshness results from the sources.json file.
- Support for on-run-end and results variable will simplify this dramatically, enabling it to be implemented in the dbt project itself.
Who will this benefit?
- dbt users who rely on on-run-end hook and the results variable to monitor their dbt commands, executions and results.
- dbt users who want to build dashboards to present their source freshness results.
- dbt users who rely on dbt packages like Elementary to upload their dbt artifacts and monitor their dbt projects.
Are you interested in contributing this feature?
Yes
Anything else?
No response
@lostmygithubaccount is something I can contribute to?
@jtalmi apologies I missed that comment! I've been messing around with how I check my GitHub notifications and I think it fell through the cracks. We would welcome contribution on this issue and can likely provide some guidance if needed.
@oravi thank you for opening. dbt source freshness
is a bit different than the other commands that support on-run-end hooks, but we don't see any reason not to allow this. That said, it's likely not something we can prioritize in the near future and so we would be looking for help with the community (it sounds like @jtalmi might be up for it!). We'll leave this issue open on the backlog and mark it as help wanted.
The on-run-*
hook + context behavior is defined in the safe_run_hooks
, before_run
, and after_run
methods of the RunTask
:
https://github.com/dbt-labs/dbt-core/blob/a3b018fd3b0c789c976e0e189788471edef8acb1/core/dbt/task/run.py#L318-L393
https://github.com/dbt-labs/dbt-core/blob/a3b018fd3b0c789c976e0e189788471edef8acb1/core/dbt/task/run.py#L418-L439
The reason FreshnessTask
doesn't run hooks today is that it inherits from the GraphRunnableTask
(which doesn't know about on-run-*
hooks), rather than the RunTask
. To turn on hooks for source freshness
, that hook code might need to move back onto GraphRunnableTask
to accommodate. If that's the approach taken, CompileTask
would need to disable those hook methods in order to preserve existing behavior (= not running on-run-*
hooks).
Thanks @jtcohen6 ! could you please share a bit more context about GraphRunnableTask and RunTask? What's the unique purpose of each one of them? Specifically curious to learn why FreshnessTask
does not inherit from RunTask
.
Also in case we will go with the suggested approach - as you mentioned I saw that CompileTask
inherits from GraphRunnableTask
and RunTask
inherits from CompileTask
. When you say that we need to disable those hooks to avoid running them during compile - you mean we just need to override them inside CompileTask
and do nothing inside their implementation (just return). Then in RunTask
should we specifically call the implementation of GraphRunnableTask
?
Thanks for all your help here!
Hi @jtcohen6 @lostmygithubaccount, I'm thinking of picking this task up soon. I would appreciate a confirmation that this feature could be merged once implemented and reviewed. Thanks.
hi @elongl, Jeremy or one of the engineers should be able to answer for sure -- I think so, though there are some ongoing internal API refactors for v1.5 that could affect this
@oravi @elongl Sorry for the long delay getting back to you!
Our current task logic is implemented with a lot of class inheritance, rather than composition, which makes this logic much trickier to follow than it should be. That is something we'd be hoping to improve as part of our API refactors over the next several months, but we haven't touched this part of the codebase just yet.
I think the desirable outcomes here are:
- Move logic on running hooks up into
GraphRunnableTask
- Use that same logic in
FreshnessTask
, which should run hooks (this issue) - Disable that logic in
CompileTask
, which shouldn't run hooks
If one of you wanted to open a PR for that, I'd be happy to see it merged!
CompileTask
, which shouldn't run hooks
This is status quo & my opinion — does anyone have a different opinion?
Hi @jtcohen6 👋
I'd be happy to contribute this feature, tho i have a question about the solution you suggested:
As i look at the code now, it seems that more classes inherit from GraphRunnableTask
(like CloneTask
and ListTask
), so your suggested solution of moving the logic to it and disabling it in the inheriting classes sounds a bit messy, so i thought of a few solutions:
- Make
FreshnessTask
inherit fromRunTask
as well, tho there might be a reason i do not fully understand that explains why this cannot be the case. - Add a new class to the hierarchy, something like
HooksTask
that exists lower in the hierarchy tree, and bothRunTask
andFreshnessTask
inherit.
Went ahead and opened this PR that implements this suggested solution:
- Make
FreshnessTask
inherit fromRunTask
as well, tho there might be a reason i do not fully understand that explains why this cannot be the case.
Functionally it seems to work fine.