overwatch
Refactor lookups in Silver Job Runs
[ Original description moved here in #1256. ]
The net effect of this PR is to refactor a number of transformations in Silver Job Runs (module 2011) so that the resulting Spark jobs and stages are labelled in a useful way. Only the last NamedTransformation applied before a Spark action in a chain has the desired effect on those labels, so the sensible way to use this feature is to end a NamedTransformation with an action, or to perform an action immediately after applying one.
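To illustrate why only the last label before an action takes effect, here is a minimal sketch. It assumes the label is carried by Spark's job description, set via `SparkContext.setJobDescription`. The `named` helper, the step names, and the `JobLabelSketch` object are hypothetical and are not Overwatch's actual `NamedTransformation` implementation.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object JobLabelSketch {

  // Hypothetical helper, NOT Overwatch's NamedTransformation: it records a
  // label on the SparkContext and then applies a lazy transformation.
  def named(label: String)(f: DataFrame => DataFrame): DataFrame => DataFrame =
    df => {
      df.sparkSession.sparkContext.setJobDescription(label)
      f(df)
    }

  // Returns the job description in effect after the action has run.
  def labelSeenByAction(spark: SparkSession): String = {
    val addDoubled = named("step A: add doubled column")(_.withColumn("doubled", col("id") * 2))
    val keepEven   = named("step B: keep even ids")(_.filter(col("id") % 2 === 0))

    // Both transformations are lazy, so no Spark job runs here. The label
    // set by step A is overwritten by step B before anything executes.
    val result = spark.range(10).toDF("id").transform(addDoubled).transform(keepEven)

    // The action: the job it triggers picks up whichever description is
    // set on this thread at this moment, i.e. the one from step B.
    result.count()
    spark.sparkContext.getLocalProperty("spark.job.description")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("job-label-sketch")
      .getOrCreate()
    try println(labelSeenByAction(spark))
    finally spark.stop()
  }
}
```

In this sketch the label from step A never reaches the Spark UI, because no job ran while it was set. Running an action right after each labelled step would give each step its own labelled job, at the cost of extra Spark jobs.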
Next steps
Further gains in resource utilization and time efficiency may be possible in the subsequent phases of the JR module (2011):
0820_release should be moved to the tip of main before this PR is completed, so I am leaving it in draft status for now.
BTW, there will be merge conflicts. I am perfectly willing to help with those. I made this mess! LMK.
@sriram251-code, the code that changes the values of the Spark UI Job Group IDs has been commented out in this branch per your recommendation (see https://github.com/databrickslabs/overwatch/pull/1253/commits/9fe9f8cedfc1bd988d5c56f44f1ffb55b3536935). I would like to understand the scenarios in which the Job Group IDs are the only place from which certain tokens/IDs can be extracted. Is it possible to enumerate these scenarios and map the flow of those tokens through the ETL to the target table(s)?
closes #1228
Please do not merge this into 0820_release until #1223 is closed, per this comment there.
to follow #1228
Quality Gate passed
Issues: 10 New issues, 0 Accepted issues
Measures: 0 Security Hotspots, 0.0% Coverage on New Code, 0.0% Duplication on New Code
