Investigate ReqMgr2 required changes to support StepChain with duplicate output module + datatier
Impact of the new feature: ReqMgr2
Is your feature request related to a problem? Please describe. PPD has expressed interest in running fork-style StepChain workflows, in which different steps of the workflow produce the same datatier and output module. This is currently not supported, and ReqMgr2 even has a protection against it.
If we were to allow such workflows to be created in the fork-style configuration, the problem that we would see in the agent is that output from different steps would be considered under the same merge task, hence incorrectly merging unrelated files.
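To illustrate the collision described above, here is a minimal sketch. It does not use WMCore code and does not reflect how the agent actually builds merge tasks; the step names, the `outputs` list and the `group_for_merge` helper are all hypothetical. It only shows that keying outputs by (output module, datatier) puts two unrelated steps in the same bucket, while adding the step to the key keeps them apart.

```python
from collections import defaultdict

# Hypothetical step outputs: (step name, output module, datatier).
outputs = [
    ("Step1", "RAWSIMoutput", "GEN-SIM"),
    ("Step2", "MINIAODSIMoutput", "MINIAODSIM"),
    ("Step3", "MINIAODSIMoutput", "MINIAODSIM"),
]


def group_for_merge(outputs, include_step):
    """Group step outputs into merge buckets (illustrative only)."""
    buckets = defaultdict(list)
    for step, module, tier in outputs:
        # Without the step in the key, identical module + datatier collide.
        key = (step, module, tier) if include_step else (module, tier)
        buckets[key].append(step)
    return dict(buckets)


collapsed = group_for_merge(outputs, include_step=False)
separated = group_for_merge(outputs, include_step=True)
```

In `collapsed`, Step2 and Step3 share one bucket, which is the situation the ReqMgr2 protection currently prevents; in `separated`, each step gets its own bucket.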
Describe the solution you'd like With this ticket, we need to:
- investigate which changes would be required to support this StepChain fork-style workflow (likely affecting how the workload object is constructed with section nodes - workload -> task -> step -> node)
- determine whether it could be made backwards compatible (handling both new-style and old-style workload object construction)
For reference, this ticket addresses the investigation on the agent side: https://github.com/dmwm/WMCore/issues/11630
Describe alternatives you've considered None
Additional context This PR might give some insight on the relevant developments in StepChain: https://github.com/dmwm/WMCore/pull/7998
Any update on this end? We find ourselves again in the same situation that triggered this request in the first place (producing two MINIAODSIM outputs from the same workflow).
I was thinking (naively) that we could "trick it" with https://github.com/cms-sw/cmssw/pull/47501, i.e. naming the output module differently for each output (one would be MINIAODSIMoutput, the other one MINIAODSIM1output). Would this work?
Hi @vlimant , I am sorry that workarounds are needed to get past this limitation.
This ticket hasn't been considered for this quarter, so there are no plans to work on this for the moment. We can definitely consider this for Q2 and it will depend on the communication with stakeholders and what our buffer can accommodate.
Nonetheless, provided that MINIAODSIM1output is consistently defined in, at least:
- workflow high level description (json file)
- job configuration (created by cmsDriver and uploaded to CouchDB during creation)
- (I am not sure about this one) the PSet attached to this job configuration
I think it should work. From what I recall, the confusion in the agent comes from the same task (with multiple steps) outputting different "data" under the same data tier AND output module.
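As a rough sketch of the consistency requirement above (not an existing WMCore or ReqMgr2 utility), the check could look like the following. The `sources` dictionary, its keys and the `inconsistent_modules` helper are hypothetical; how the output module names would actually be extracted from the JSON, the CouchDB configuration or the PSet is left out.

```python
def inconsistent_modules(sources):
    """Return {module name: [sources missing it]} for names not defined everywhere."""
    all_names = set().union(*sources.values())
    report = {}
    for name in sorted(all_names):
        missing = sorted(src for src, names in sources.items() if name not in names)
        if missing:
            report[name] = missing
    return report


# Hypothetical module names collected from each place listed above.
sources = {
    "workflow_json": {"MINIAODSIMoutput", "MINIAODSIM1output"},
    "job_config": {"MINIAODSIMoutput", "MINIAODSIM1output"},
    "pset": {"MINIAODSIMoutput"},
}
report = inconsistent_modules(sources)
```

With the example input, `report` flags `MINIAODSIM1output` as missing from the `pset` source; an empty report would mean the renamed module is defined consistently everywhere.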
thanks for the positive feedback ; we'll try it out with the output renaming
can you please help out with https://its.cern.ch/jira/browse/CMSPROD-264 to figure out an a priori check on TaskChain that would prevent a StepChain conversion? I believe one can look into the splitting document for duplicated Merge*output
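A possible shape for such a check, as a sketch only: it assumes the splitting document is a list of entries with a `taskName` path, and that merge task names end in `Merge<output module>`. Both are assumptions to verify against the real document, and `duplicated_merge_outputs` and the example paths are hypothetical.

```python
from collections import defaultdict


def duplicated_merge_outputs(splitting):
    """Return {output module: [merge task paths]} for modules merged more than once."""
    seen = defaultdict(list)
    for entry in splitting:
        task_path = entry["taskName"]
        leaf = task_path.rsplit("/", 1)[-1]
        # Assumed naming convention: "<parent task>Merge<output module>".
        _, sep, module = leaf.rpartition("Merge")
        if sep and module.endswith("output"):
            seen[module].append(task_path)
    return {mod: paths for mod, paths in seen.items() if len(paths) > 1}


# Hypothetical splitting document of a TaskChain with two MINIAODSIM outputs.
splitting = [
    {"taskName": "/wf/Task1"},
    {"taskName": "/wf/Task1/Task1MergeRAWSIMoutput"},
    {"taskName": "/wf/Task1/Task2/Task2MergeMINIAODSIMoutput"},
    {"taskName": "/wf/Task1/Task3/Task3MergeMINIAODSIMoutput"},
]
duplicates = duplicated_merge_outputs(splitting)
```

A non-empty result would mean the TaskChain should not be converted to a StepChain as it stands.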
Hi @vlimant , likely @hassan11196 might help with your request, since the TaskChain-to-StepChain conversion is usually managed by P&R.
This ticket hasn't been considered for this quarter, so there are no plans to work on this for the moment. We can definitely consider this for Q2 and it will depend on the communication with stakeholders and what our buffer can accommodate.
that'd be nice, as the trick works just fine, but is very involved / unmaintainable on the submission side