overwatch
Check that additional duration metrics are added to workflow runs
Currently, runSucceeded events in the audit logs provide the following attributes:
- ["idInJob", "jobId", "jobTriggerType", "orgId", "runId", "jobClusterType", "jobTaskType", "jobTerminalState"]
However, according to the Jobs API docs, several useful metrics should also be available:
- setup_duration
- execution_duration
- cleanup_duration
If they can be added to the audit log JSON schema, overhead analysis would be much simpler.
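As a rough illustration of the overhead analysis these fields would enable, here is a minimal sketch. The function name `overhead_fraction` is hypothetical and not part of Overwatch; it assumes the three durations are reported in milliseconds, as in the Jobs API.

```python
def overhead_fraction(setup_ms: int, execution_ms: int, cleanup_ms: int) -> float:
    """Share of the measured run time spent on setup and cleanup.

    Hypothetical helper for illustration; inputs are the three
    duration fields from the Jobs API, in milliseconds.
    """
    total_ms = setup_ms + execution_ms + cleanup_ms
    if total_ms == 0:
        # No durations reported, so no overhead can be computed.
        return 0.0
    return (setup_ms + cleanup_ms) / total_ms


if __name__ == "__main__":
    # Example: 1 s setup, 59 s execution, no cleanup.
    print(f"{overhead_fraction(1000, 59000, 0):.4f}")
```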
As reported by @psg2
@uucico, do you recall whether this data is available in API v2.1 or v2.0? I think you're suggesting that the data is not available in the audit logs but is available from the API, right?
We're looking into adding support for API v2.1 in OW release 0.6.2.1 due to #458, but if there are schema conflicts it will be delayed until 0.6.2.0.
Hi @GeekSheikh, that's correct, though I have not looked at the audit logs recently to see whether new fields were added. I did a quick test against the APIs, and the data is now available in v2.1: the relevant metrics show up at the task level. They also appear in the top-level object, as they did in v2.0, but there they are all zero in the test below. It seems API v2.1 is the way to go to get these metrics.
{
  "job_id": 29,
  "run_id": 15857,
  "creator_user_name": "[email protected]",
  "number_in_job": 15857,
  "original_attempt_run_id": 15857,
  "state": {
    "life_cycle_state": "TERMINATED",
    "result_state": "SUCCESS",
    "state_message": "",
    "user_cancelled_or_timedout": false
  },
  "start_time": 1654524046057,
  "setup_duration": 0,
  "execution_duration": 0,
  "cleanup_duration": 0,
  "end_time": 1654524106428,
  "trigger": "ONE_TIME",
  "run_name": "DEJ Job",
  "run_page_url": "https://adb-xxx.azuredatabricks.net/?o=799332275071854#job/29/run/15857",
  "run_type": "JOB_RUN",
  "tasks": [
    {
      "run_id": 16392,
      "task_key": "DEJ_Job",
      "spark_jar_task": {
        "jar_uri": "",
        "main_class_name": "xxx.xxx.xx.xx.xxxxx",
        "parameters": [
          "--class",
          "xxx.xxx.xx.xx.xxxxx"
        ],
        "run_as_repl": true
      },
      "existing_cluster_id": "0606-133302-k1ggyol1",
      "libraries": [
        {
          "jar": "dbfs:/FileStore/jars/4eec47d0_77bd_4f31_8c74_fff0ca947b6f-DEJ_1_0_0-5c4cf.jar"
        }
      ],
      "state": {
        "life_cycle_state": "TERMINATED",
        "result_state": "SUCCESS",
        "state_message": "",
        "user_cancelled_or_timedout": false
      },
      "run_page_url": "https://adb-xxx.azuredatabricks.net/?o=799332275071854#job/29/run/16392",
      "start_time": 1654524046094,
      "setup_duration": 1000,
      "execution_duration": 59000,
      "cleanup_duration": 0,
      "end_time": 1654524106314,
      "cluster_instance": {
        "cluster_id": "0606-133302-k1ggyol1",
        "spark_context_id": "1450494052714607854"
      },
      "attempt_number": 0
    }
  ],
  "format": "MULTI_TASK"
}
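Since the top-level durations come back as zero for this multi-task run, one option is to read the metrics from each entry in `tasks` and sum them. The following is a minimal sketch of that idea, not Overwatch's implementation: the helper names `task_durations` and `total_durations` are hypothetical, the input is assumed to be an already-parsed v2.1 run object like the one above, and `sample_run` is a trimmed copy of that response.

```python
from typing import Any, Dict, List

# Duration fields reported by the Jobs API, in milliseconds.
DURATION_KEYS = ("setup_duration", "execution_duration", "cleanup_duration")


def task_durations(run: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one row per task with its run_id, task_key, and durations."""
    rows = []
    for task in run.get("tasks", []):
        row = {"run_id": task.get("run_id"), "task_key": task.get("task_key")}
        for key in DURATION_KEYS:
            # Treat a missing field as 0 so the sums below stay well defined.
            row[key] = task.get(key, 0)
        rows.append(row)
    return rows


def total_durations(run: Dict[str, Any]) -> Dict[str, int]:
    """Sum each duration field across all tasks of the run."""
    rows = task_durations(run)
    return {key: sum(row[key] for row in rows) for key in DURATION_KEYS}


# Trimmed copy of the response above (assumption: only these fields matter here).
sample_run = {
    "run_id": 15857,
    "setup_duration": 0,
    "execution_duration": 0,
    "cleanup_duration": 0,
    "tasks": [
        {
            "run_id": 16392,
            "task_key": "DEJ_Job",
            "setup_duration": 1000,
            "execution_duration": 59000,
            "cleanup_duration": 0,
        }
    ],
}

if __name__ == "__main__":
    print(task_durations(sample_run))
    print(total_durations(sample_run))
```

Summing across tasks is only a rough aggregate: tasks in a multi-task job can run in parallel, so the total is not necessarily the wall-clock duration of the run.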