Clear downstream data automatically when resetting the task
Feature Request
Is your feature request related to a problem? Please describe:
Making users clear downstream data manually can be troublesome.
Describe the feature you'd like:
DM can clear downstream data automatically when resetting the task (with a special argument for start-task), because it knows which tables can be cleared from the block-allow-list and route-rules.
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Migration Strategy:
Maybe users would depend on this feature wrongly. For example, with t* in the BA list, the upstream has t1~t3 and the downstream has t1~t5. Both removing t4~t5 and keeping them seem somewhat reasonable 🤔
We drop tables based on upstream names. In your example, we should only drop t1~t3.
In other words, we only drop tables we are planning to migrate.
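The rule above can be sketched as follows. This is only an illustration, not DM's implementation: `tables_to_clear`, the `(schema_pattern, table_pattern)` allow list, and the four-field route tuples are hypothetical simplifications of block-allow-list and route-rules, and wildcard matching is approximated with Python's `fnmatch`.

```python
from fnmatch import fnmatchcase


def tables_to_clear(upstream_tables, allow_patterns, routes):
    """Return the downstream (schema, table) pairs the task plans to write to.

    Hypothetical helper: the result is derived only from tables that
    currently exist upstream, never from what happens to exist downstream.
    """
    targets = set()
    for schema, table in upstream_tables:
        # Skip upstream tables that the allow list does not match.
        allowed = any(
            fnmatchcase(schema, sp) and fnmatchcase(table, tp)
            for sp, tp in allow_patterns
        )
        if not allowed:
            continue
        # Apply the first matching route rule; otherwise keep the names.
        target = (schema, table)
        for sp, tp, target_schema, target_table in routes:
            if fnmatchcase(schema, sp) and fnmatchcase(table, tp):
                target = (target_schema, target_table)
                break
        targets.add(target)
    return targets


# The example from this thread: t* is allowed, upstream has t1~t3,
# downstream has t1~t5.
upstream = [("db", "t1"), ("db", "t2"), ("db", "t3")]
downstream = {("db", f"t{i}") for i in range(1, 6)}
allow = [("db", "t*")]

# Only drop tables that both exist downstream and are migration targets.
to_drop = tables_to_clear(upstream, allow, routes=[]) & downstream
print(sorted(to_drop))
```

Under these assumptions t4 and t5 are left alone even though they match `t*`, because no upstream table maps to them.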
Maybe in the last sync the user created t1~t9 and they were synced to the downstream. Then the user stopped the task and cleaned all upstream tables (maybe the tables are created by some application), then started that application again. Before this sync, only t1~t3 have been created in the upstream; t4~t9 will be created sometime later, and we don't have enough confidence to drop them.