WMCore Change how run number is defined for harvested root files in multiRun mode

Fixes #9690

Status

not-tested

Description

In short:

- if harvesting MC data (in either byRun or multiRun mode): the run number is set to 1
- if harvesting data in byRun mode: the run number is left unchanged, so it is taken from the harvested data
- if harvesting data in multiRun mode: the run number is forced to 999999
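
The three rules above can be sketched as a single decision function. This is only an illustration of the intended behaviour, not the code in this PR: the function name, its arguments, and the two constants are hypothetical, and how `is_mc` gets determined is not covered here.

```python
# Hypothetical names; only the values 1 and 999999 come from the description.
MC_RUN_NUMBER = 1
MULTI_RUN_NUMBER = 999999


def harvested_run_number(is_mc, multi_run, data_run):
    """Return the run number to attach to a harvested root file.

    is_mc     -- True when harvesting MC (assumed to be known by the caller)
    multi_run -- True in multiRun mode, False in byRun mode
    data_run  -- run number found in the harvested data
    """
    if is_mc:
        # MC: always run 1, in byRun and in multiRun mode
        return MC_RUN_NUMBER
    if multi_run:
        # data, multiRun mode: forced placeholder run number
        return MULTI_RUN_NUMBER
    # data, byRun mode: keep whatever the harvested data says
    return data_run
```

For example, `harvested_run_number(False, True, 315257)` gives 999999, while the same call with `multi_run=False` returns 315257 unchanged.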

Is it backward compatible (if not, which system it affects?)

no, it cannot be applied to workflows with harvesting jobs already created

Related PRs

none

External dependencies / deployment changes

none

Jun 15 '20 20:06 amaltaro

Jenkins results:

Unit tests: failed
- 1 new failures
- 1 tests no longer failing
Pylint check: failed
- 9 warnings and errors that must be fixed
- 15 comments to review
Pycodestyle check: succeeded
- 1 comments to review
Python3 compatibility checks: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10112/artifact/artifacts/PullRequestReport.html

Jun 15 '20 21:06 cmsdmwmbot

Jenkins results:

Unit tests: succeeded
- 1 tests no longer failing
Pylint check: failed
- 9 warnings and errors that must be fixed
- 18 comments to review
Pycodestyle check: succeeded
- 1 comments to review
Python3 compatibility checks: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10113/artifact/artifacts/PullRequestReport.html

Jun 15 '20 21:06 cmsdmwmbot

This will be more complicated than I initially foresaw. Run-dependent MC really does have run numbers > 1, whereas I thought that all of that logic was internal to CMSSW when processing data... Here is part of my harvesting job:

 'input_files': [{'checksums': {'adler32': '861551bd', 'cksum': '661964275'},
                  'events': 0,
                  'first_event': 0,
                  'last_event': 0,
                  'lfn': '/store/backfill/1/CMSSW_11_1_0_pre7/RelValTTbar_13UP18_RD/DQMIO/RECOPRMXUP18_PU25_RD_TC_MC_multiRun_June2020_Val_Alanv12-v11/00000/1ED13C52-B0E9-11EA-8109-D0CDE183BEEF.root',
                  'locations': set([]),
                  'merged': True,
                  'parents': set([]),
                  'runs': set([]),
                  'size': 68426312}],
 'jobType': 'Harvesting',
 'jobgroup': 555,
 'location': None,
 'mask': {'FirstEvent': None,
          'FirstLumi': None,
          'FirstRun': None,
          'LastEvent': None,
          'LastLumi': None,
          'LastRun': None,
          'inclusivemask': True,
          'runAndLumis': {315257: [[1, 36]]}},

so we need to find a systematic way to identify such run-dependent MC files.
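
As a starting point, the run numbers can at least be read from the job mask shown above, since the `runs` field of the input file is empty. The sketch below assumes the job is available as a plain dict with the same keys as in the dump; both helper names are hypothetical, and the `is_mc` flag still has to come from somewhere else, which is exactly the open question.

```python
def runs_in_mask(job):
    """Return the sorted run numbers listed in job['mask']['runAndLumis']."""
    mask = job.get("mask") or {}
    return sorted(mask.get("runAndLumis") or {})


def looks_run_dependent(job, is_mc):
    """Flag an MC job whose mask carries a run number other than 1.

    is_mc is supplied by the caller; this sketch does not infer it.
    """
    return is_mc and any(run > 1 for run in runs_in_mask(job))


# Trimmed-down version of the job dump above
job = {"jobType": "Harvesting",
       "mask": {"FirstRun": None,
                "LastRun": None,
                "runAndLumis": {315257: [[1, 36]]}}}
print(runs_in_mask(job))
```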

Jun 18 '20 06:06 amaltaro

Hi @amaltaro, do you mean we don't have a way to distinguish between data and MC at harvesting time on the WM side? Thanks.

Feb 22 '21 16:02 srimanob

Just to be clear, please correct me if I am wrong: right now, before this PR is merged,

MRH files, either data or MC, have a parameter in WMCore, "runLimits", "-%s-%s" % (minRun, maxRun)) [1], which is used in the dataset name for DQM. I am not sure how many of these have been uploaded to the DQM GUI; I can find only one in the development GUI and none in the Offline GUI. This one: https://tinyurl.com/ycj7luc9

which has RunNumber forced to 999999 in the DQM search box, even though this does not match the run number displayed in the menu of the DQM GUI (278017, the longest one in the range?), but the dataset name keeps the run range used in the harvesting: /NoBPTX/Run2016F-23Sep2016-v1-277932-278193/DQMIO

This would be the desired behaviour for MRH in the DQM GUI, so that a DQM user can trace back directly from the dataset name which runs (a range) it contains, even though the DQM search is performed with run = 999999.
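
To make the naming concrete, here is a minimal sketch of how a "-%s-%s" % (minRun, maxRun) suffix turns a plain dataset name into the one quoted above. The helper name is hypothetical, and the assumption that the suffix is appended to the processed-dataset part is inferred only from the example name; where exactly WMCore does this is in the diff linked at [1].

```python
def dataset_with_run_range(dataset, min_run, max_run):
    """Append the run range to the processed-dataset part of a dataset name."""
    # '/primary/processed/tier' splits into ['', primary, processed, tier]
    _, primary, processed, tier = dataset.split("/")
    run_limits = "-%s-%s" % (min_run, max_run)
    return "/%s/%s%s/%s" % (primary, processed, run_limits, tier)


name = dataset_with_run_range("/NoBPTX/Run2016F-23Sep2016-v1/DQMIO", 277932, 278193)
print(name)  # /NoBPTX/Run2016F-23Sep2016-v1-277932-278193/DQMIO
```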

I also see several ALCAPROMPT datasets uploaded in this way to the Offline DQM GUI, all of them with runNumber forced to 999999, but with different dataset names and different runs displayed in the header of the GUI. E.g. /StreamExpress/Run2018A-PromptCalibProdSiStripGainsAAG-Express-v1-316702-316766/ALCAPROMPT https://tinyurl.com/yaz6vfyt So they can be distinguished by dataset name (run range) and even by the displayed run number (in the header of the GUI), even though all of them have 999999.

After https://github.com/dmwm/WMCore/pull/9746 is merged, we lose all the functionality described above, and every time an MRH root file is registered for an existing dataset name, it is overwritten, no matter which run range was used in the harvesting.
In single-run mode, the run number is always kept.
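
The overwrite concern can be shown with a toy model. It assumes, purely for illustration, that an upload is identified by the pair (dataset name, run number); the `register` helper, the file names, and the second run range are made up and do not describe the real DQM GUI index.

```python
def register(index, dataset, run, filename):
    """Toy registration: one entry per (dataset, run) pair, last upload wins."""
    index[(dataset, run)] = filename


# Run range in the dataset name: two harvests stay distinct
with_range = {}
register(with_range, "/NoBPTX/Run2016F-23Sep2016-v1-277932-278193/DQMIO", 999999, "first.root")
register(with_range, "/NoBPTX/Run2016F-23Sep2016-v1-278194-278300/DQMIO", 999999, "second.root")

# No run range in the dataset name: the second harvest replaces the first
without_range = {}
register(without_range, "/NoBPTX/Run2016F-23Sep2016-v1/DQMIO", 999999, "first.root")
register(without_range, "/NoBPTX/Run2016F-23Sep2016-v1/DQMIO", 999999, "second.root")

print(len(with_range), len(without_range))
```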

@ahmad3213 @emanueleusai @rvenditti please speak up, whether you agree or disagree

Thanks

[1] https://github.com/dmwm/WMCore/pull/9746/files#diff-3c13cdc9485083bb43b4e4d3d37f7310b878d36bc137ce2a7cf8f08de4e9daf0L181-R184

Mar 23 '22 08:03 jfernan2

Jenkins results:

Python3 Unit tests: succeeded
- 440 tests deleted
- 19 tests no longer failing
- 13 tests added
- 3 changes in unstable tests
Python3 Pylint check: failed
- 64 warnings and errors that must be fixed
- 5 warnings
- 343 comments to review
Pylint py3k check: failed
- 102 errors and warnings that should be fixed
- 79 warnings
Pycodestyle check: succeeded
- 447 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/13176/artifact/artifacts/PullRequestReport.html

May 09 '22 13:05 cmsdmwmbot

Can one of the admins verify this patch?

Sep 30 '24 21:09 cmsdmwmbot