
Add ability to resume execution of DQD checks

Open ganisimov opened this issue 2 years ago • 1 comment

Sometimes executeDqChecks fails partway through a run because of connection issues or other conditions, and all progress is lost.

When the new resume argument is set to TRUE, the file at the outputFile path is used as the source of check results instead of running every check again. Checks whose results are missing, or whose results report an error, are re-processed.

Note that outputFile must be set explicitly for resuming to take effect. New results overwrite the file at the outputFile path.
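
To illustrate the resume behavior described above, here is a minimal, language-neutral sketch in Python. It is not the code in this PR and not the DataQualityDashboard API: the function `run_checks_with_resume`, the `run_check` callback, the JSON layout keyed by check ID, and the `"error"` field are all hypothetical names chosen for the example. It only shows the general idea of reusing stored results, re-running missing or errored checks, and overwriting the output file.

```python
import json
import os
from typing import Callable, Dict, List


def run_checks_with_resume(
    check_ids: List[str],
    run_check: Callable[[str], dict],  # hypothetical: executes one check
    output_file: str,
    resume: bool = False,
) -> Dict[str, dict]:
    """Run checks, reusing earlier results from output_file when resume is True."""
    previous: Dict[str, dict] = {}
    if resume and os.path.exists(output_file):
        # Assumed layout: a JSON object mapping check ID to its result.
        with open(output_file, encoding="utf-8") as fh:
            previous = json.load(fh)

    results: Dict[str, dict] = {}
    for check_id in check_ids:
        cached = previous.get(check_id)
        # Reuse a stored result only if it exists and did not report an error.
        if cached is not None and not cached.get("error"):
            results[check_id] = cached
            continue
        try:
            results[check_id] = run_check(check_id)
        except Exception as exc:
            # Record the failure so a later resumed run retries this check.
            results[check_id] = {"error": str(exc)}

    # New results overwrite the file at output_file.
    with open(output_file, "w", encoding="utf-8") as fh:
        json.dump(results, fh, indent=2)
    return results
```

In this sketch, a first call with `resume=False` writes whatever succeeded or failed to `output_file`; a second call with `resume=True` and the same path would only invoke `run_check` for the checks that are absent from the file or recorded with an error.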

Addresses #109

ganisimov, Dec 19 '22 11:12

I 100% want to add this feature, thanks @ganisimov for the contribution and @MaximMoinat for reviving the thread 😃

I've been thinking, however, that we might want to wait to add it in as part of the larger check-running workflow overhaul we've been discussing implementing in 2024. Part of the vision for that workflow includes moving away from the current "all-or-nothing" execution towards something more incremental that's aware of the relationships/dependencies between checks. I think that implementation will lend itself better to figuring out how/where to store incremental results, how to handle various failure modes, etc.

Would definitely love to discuss this in a WG meeting early next year!

katy-sadowski, Dec 30 '23 02:12