WMCore
R&D Investigate on pilot lifetime projected onto HL-LHC era
Impact of the new feature Placeholder for an R&D issue
Is your feature request related to a problem? Please describe. This is related to the list of TODO tasks in the "Evaluation of the WM system for the HL-LHC scenario" Google document.
Describe the solution you'd like The current pilot lifetime is set to 8 hours. We should investigate whether changing it would be beneficial during the HL-LHC era, either globally or only for specific resource types such as HPC (e.g. 48-hour pilots). If the 8h value needs to change, we should describe what has to change to support that; for instance, it could be done in reqmgr2 (the value is static now, though making it configurable would be trivial).
This is actually a great idea, Kenyi. I would suggest evaluating a few workflows targeting a job wallclocktime of 4h, 8h, 12h and 16h.
A few metrics that come to my mind would be:
- avg job wallclocktime
- min job wallclocktime
- max job wallclocktime
- total workflow wallclocktime
- workflow turnaround (time from acquired to completed status)
- condor retry/failure rate
- wmagent retry/failure rate
- ?
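As a rough illustration of how the metrics above could be computed once per-job records have been collected for a test workflow, here is a minimal Python sketch. Everything in it is hypothetical: the `JobRecord` fields, the `summarize` helper and the definition of "retry rate" as the fraction of jobs with at least one retry are assumptions made for the example, not existing WMCore, WMAgent or HTCondor interfaces.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class JobRecord:
    """Hypothetical per-job record; field names are illustrative only."""
    wallclock_s: float    # wallclock time of the job, in seconds
    condor_retries: int   # assumed count of condor-level retries
    wmagent_retries: int  # assumed count of wmagent-level retries


def summarize(jobs, acquired_ts, completed_ts):
    """Compute the proposed metrics for one workflow.

    acquired_ts / completed_ts are epoch seconds for the workflow's
    transitions to the 'acquired' and 'completed' statuses.
    """
    wall = [j.wallclock_s for j in jobs]
    n = len(jobs)
    return {
        "avg_wallclock_h": mean(wall) / 3600,
        "min_wallclock_h": min(wall) / 3600,
        "max_wallclock_h": max(wall) / 3600,
        "total_wallclock_h": sum(wall) / 3600,
        "turnaround_h": (completed_ts - acquired_ts) / 3600,
        # Assumption: rate = fraction of jobs retried at least once.
        "condor_retry_rate": sum(j.condor_retries > 0 for j in jobs) / n,
        "wmagent_retry_rate": sum(j.wmagent_retries > 0 for j in jobs) / n,
    }


# Example with made-up numbers for a workflow targeting ~8h jobs.
jobs = [
    JobRecord(4 * 3600, 0, 0),
    JobRecord(8 * 3600, 1, 0),
    JobRecord(12 * 3600, 0, 2),
    JobRecord(8 * 3600, 0, 0),
]
summary = summarize(jobs, acquired_ts=0, completed_ts=20 * 3600)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Running the same summary for each target wallclocktime (4h, 8h, 12h, 16h) would give one row per configuration to compare side by side.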
@khurtado Kenyi, can you please apply the relevant labels (I guess only R&D) and fill in the fields of the project board (for the other R&D issues as well)?