azkaban-plugins
Job Summary plugin job_id parsing does not work for Hadoop 2 job ids and URLs
In log-data.js, the job_id regexes expect job_<12 digits>_<4+ digits>. However, on Hadoop 2, instead of <12 digits> of the form YYYYMMDDHHMM, the job_id appears to contain the milliseconds since epoch, which is variable in length and is currently 13 digits.
Also, in Hadoop 2, the job URL printed out in the logs does not contain job_ (which the current url regex is looking for) but instead looks something like http://<host>:<port>/proxy/application_<milliseconds_since_epoch>_<counter>.
We should fix the regexes so that the job summary plugin will find the job ids and URLs.
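A rough sketch of what more permissive patterns might look like follows. The names `JOB_ID_REGEX`, `JOB_URL_REGEX`, `findJobId`, and `findJobUrl` are hypothetical and are not taken from log-data.js, and the sample log lines in the usage note are made up for illustration. The sketch assumes the counter is still at least 4 digits and that a URL ends at the first whitespace character.

```javascript
// Hypothetical patterns, not the regexes currently in log-data.js.

// Accept any number of digits before the counter, so both the 12-digit
// YYYYMMDDHHMM form and the 13-digit epoch-milliseconds form match.
var JOB_ID_REGEX = /job_\d+_\d{4,}/;

// Accept either a job_ id or an application_ id somewhere in the URL.
// Assumption: the URL contains no whitespace.
var JOB_URL_REGEX = /https?:\/\/\S+?(?:job|application)_\d+_\d{4,}\S*/;

// Return the first job id found in a log line, or null if there is none.
function findJobId(line) {
  var match = JOB_ID_REGEX.exec(line);
  return match ? match[0] : null;
}

// Return the first job URL found in a log line, or null if there is none.
function findJobUrl(line) {
  var match = JOB_URL_REGEX.exec(line);
  return match ? match[0] : null;
}
```

For example, `findJobId("Running job: job_1400000000000_0001")` would return the 13-digit id, and `findJobUrl` would pick up a URL of the form http://<host>:<port>/proxy/application_<milliseconds_since_epoch>_<counter> as well as an older URL that contains job_.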
Tracked by internal JIRA HADOOP-6977.