Dump output from `bhist -l <lsfjobid>` to runpath
Is your feature request related to a problem? Please describe.
The output from bhist -l <jobid> for finished LSF jobs is useful enough that it should be kept easily accessible, so it should be dumped to the runpath:
Job <289263>, User <havb>, Project <default>, Command <sleep 10>
Thu Apr 18 10:00:40: Submitted from host <st-grid03>, to Queue <normal>, CWD <$
HOME>;
Thu Apr 18 10:01:28: Dispatched to <st-rst14-03-05>, Effective RES_REQ <select[
(cs)&&(type == any )&&(mem>maxmem*1/12)] order[r15s:pg:bjo
bs] span[hosts=1] same[model] >;
Thu Apr 18 10:01:28: Starting (Pid 658);
Thu Apr 18 10:01:28: Running with execution home </private/havb>, Execution CWD
</private/havb>, Execution Pid <658>;
Thu Apr 18 10:01:38: Done successfully. The CPU time used is 0.1 seconds;
Thu Apr 18 10:02:01: Post job process done successfully;
MEMORY USAGE:
MAX MEM: 3.9 Gbytes; AVG MEM: 3.9 Gbytes
Summary of time in seconds spent in various states by Thu Apr 18 10:02:01
PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL
48 0 10 0 0 0 58
Describe the solution you'd like
Dump the output to a file in the runpath; the filename is still to be decided.
Describe alternatives you've considered
Do nothing.
The LSF stdout might be sufficient, although it must first be fixed in #7695. Examine whether the two outputs differ when, for example, OOM strikes.
What should the filename be? @berland I have some ideas:
- bhist_job_summary.txt
- lsf_job_summary.txt
- job_summary.txt (would not be created for queue systems other than LSF anyway)
The one on the left is lsf stdout while the right one is the bhist long version.
As for filename, we already have <JOBNAME>.LSF-out for stdout, and we might get <JOBNAME>.LSF-err for stderr (that is a potential issue to write). To be in line with that system, what about <JOBNAME>.LSF-bhist-l ?
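To make the proposal concrete, here is a minimal sketch of what the dump could look like, assuming the `<JOBNAME>.LSF-bhist-l` naming suggested above. The function name `dump_bhist`, its parameters, and the injectable `run_command` hook are hypothetical and only there for illustration; the only external command assumed is `bhist -l <jobid>` as shown in this issue.

```python
import subprocess
from pathlib import Path
from typing import Callable, Optional, Sequence


def _run_and_capture(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout as text (errors are not raised)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


def dump_bhist(
    job_id: str,
    job_name: str,
    runpath: Path,
    run_command: Optional[Callable[[Sequence[str]], str]] = None,
) -> Path:
    """Write the output of `bhist -l <job_id>` to <runpath>/<job_name>.LSF-bhist-l.

    `run_command` is a hypothetical hook so the function can be exercised
    without an LSF installation; by default the real `bhist` is invoked.
    """
    runner = run_command or _run_and_capture
    output = runner(["bhist", "-l", job_id])
    target = runpath / f"{job_name}.LSF-bhist-l"
    target.write_text(output)
    return target
```

Calling `dump_bhist("289263", "MYJOB", Path("/some/runpath"))` after the job has finished would then leave `MYJOB.LSF-bhist-l` next to `MYJOB.LSF-out`.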
The lsf stdout already provides all the information found in bhist -l, so echoing the output to a file wouldn't give us anything extra.
One field that is not included in lsf stdout is Dispatched to <cluster_node>, Effective RES_REQ <select[(cs)&&(type==any)>.
Maybe getting the resource requirement string would be reason enough to keep the output? @berland
Yes, I think this is sufficient to warrant outputting this as well. There might be other corner-case scenarios where the two outputs differ further, and those are the cases where it is most interesting.
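If the resource requirement string is the main thing we are after, it could also be pulled out of the dumped text. The sketch below is only an illustration: the function name `extract_effective_res_req` is hypothetical, and it assumes that `bhist -l` wraps long lines mid-token as in the example in the issue description, so that continuation lines can be rejoined by dropping the newline and any leading whitespace.

```python
import re
from typing import Optional


def extract_effective_res_req(bhist_output: str) -> Optional[str]:
    """Return the Effective RES_REQ string from `bhist -l` output, or None.

    Assumption: wrapped lines are rejoined by removing each newline together
    with the whitespace that follows it. This can glue words together where a
    wrap happened to fall on a space, which is acceptable for this field only.
    """
    joined = re.sub(r"\n\s*", "", bhist_output)
    # The value itself may contain '>' (e.g. mem>maxmem), so match up to '>;'.
    match = re.search(r"Effective RES_REQ <(.*?)\s*>;", joined)
    return match.group(1) if match else None
```

Applied to the dispatch lines quoted at the top of this issue, this would give back the `select[...] order[...] span[hosts=1] same[model]` string as a single line.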