Add association between dataset events and the dataset DAG run queue
Notes from talking to Ankit off-thread:
- I think adding an association table shouldn’t affect triggering_dataset_events. SQLAlchemy loads relationships lazily unless we configure eager loading (we don’t), so the new relation shouldn’t be loaded at all unless the user accesses it. They shouldn’t access it (it’s unsupported), but if they do, they pay an unavoidable performance penalty.
- Right now we pass all triggered events collected by DDRQ between the prior trigger and the current trigger to the downstream timetable, and let it come up with an appropriate data interval for the downstream DAG run. The logic is pretty obvious for ALL (the default and the current logic), but less so for ANY or anything more complicated. We might need a way for users to override that timetable function to generate a more appropriate data interval, but that can be handled in the future when the need comes up.
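To make the second note concrete, here is a rough sketch of how a timetable could turn the collected events into a data interval under the ALL logic. It is not the actual Airflow code: the `Event` class, its fields, and the function name `data_interval_for_events` are hypothetical stand-ins, and it assumes each event carries the data interval of the upstream run that emitted it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a triggering dataset event."""

    dataset_uri: str
    # Assumed: the data interval of the upstream run that emitted the event.
    interval_start: datetime
    interval_end: datetime


def data_interval_for_events(
    events: Iterable[Event],
) -> Optional[Tuple[datetime, datetime]]:
    """Span all events: earliest start to latest end (illustrative ALL logic)."""
    events = list(events)
    if not events:
        # Nothing triggered the run, so there is no interval to derive.
        return None
    start = min(e.interval_start for e in events)
    end = max(e.interval_end for e in events)
    return start, end


# Usage: two upstream datasets updated by runs with different intervals.
utc = timezone.utc
events = [
    Event("s3://bucket/a", datetime(2024, 1, 1, tzinfo=utc), datetime(2024, 1, 2, tzinfo=utc)),
    Event("s3://bucket/b", datetime(2024, 1, 3, tzinfo=utc), datetime(2024, 1, 4, tzinfo=utc)),
]
interval = data_interval_for_events(events)
```

With ALL, every dataset has at least one event, so spanning them is a natural choice. With ANY, only a subset of datasets may have events, and the same span may not be what the user wants, which is why an override hook might be needed later.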