dbt-databricks
Model level retry
Describe the feature
Hi @benc-db, does dbt currently support model-level retry? I couldn't find any information about it on the dbt website. I think it would be a useful feature, because some errors are transient. With model-level retry, we could configure a retry count, and dbt would re-submit the failed model to Databricks for us.
Describe alternatives you've considered
Just fail-fast, then use "dbt retry" to rerun all models.
Are you interested in contributing this feature?
Yes. First, can you confirm that this is not currently supported?
There are a number of retries that happen before you even see an error, so it depends on what sort of error is surfacing. Do you have an example of a transient error that would or could succeed on retry? For many transient errors, the retries are simply invisible to the user.
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue.
I would also like to see model-level retry.
My use case is that we have ~2000 production models that run daily, and multiple days per week we get a Databricks error on one job saying that the job can't be scheduled (e.g. [INTERNAL_ERROR] Query could not be scheduled: HTTP Response code: 503. Please try again later. SQLSTATE: XX000).
I'd like to automatically retry such failed models up to X times before failing the entire pipeline.
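In the meantime, a wrapper around the CLI can approximate this. Below is a minimal sketch, not a built-in feature: it runs an initial command and, if that exits non-zero, calls `dbt retry` up to a fixed number of times with a growing pause in between. The function names (`run_cli`, `run_with_retries`), the `dbt build` default, and the 30-second base delay are assumptions for illustration. It retries on any failure, not just 503-style scheduling errors, and it relies on `dbt retry` picking up from the previous invocation's results.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_cli(args: Sequence[str]) -> int:
    """Run a command in a subprocess and return its exit code."""
    return subprocess.run(list(args)).returncode


def run_with_retries(
    first_command: Sequence[str] = ("dbt", "build"),  # assumed entry point
    max_retries: int = 3,                             # the "X" from the request
    base_delay_s: float = 30.0,                       # assumed backoff base
    run: Callable[[Sequence[str]], int] = run_cli,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run first_command, then `dbt retry` until success or retries run out.

    Returns the exit code of the last command that was run.
    """
    code = run(first_command)
    attempt = 0
    while code != 0 and attempt < max_retries:
        # Wait longer after each failure: base, 2*base, 4*base, ...
        sleep(base_delay_s * 2 ** attempt)
        attempt += 1
        # `dbt retry` re-executes the previous command from the point of failure.
        code = run(("dbt", "retry"))
    return code
```

Calling `run_with_retries(max_retries=3)` from the scheduler's entry point and exiting with its return value would fail the pipeline only after the retries are exhausted. A stricter version could inspect the error text and retry only on messages such as the "Query could not be scheduled" one above.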