Fix location of alpaka modification for particleFlowClusterHCAL
PR description:
Identified by @makortel, this change moves the modification of particleFlowClusterHCAL to the correct configuration file in accordance with the coding rules.
PR validation:
Tested the associated .42X matrix workflows.
If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:
Not a backport.
cms-bot internal usage
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49295/46656
A new Pull Request was created by @jsamudio for master.
It involves the following packages:
- RecoParticleFlow/PFClusterProducer (reconstruction)
@cmsbuild, @jfernan2, @mandrenguyen, @srimanob can you please review it and eventually sign? Thanks. @felicepantaleo, @hatakeyamak, @lgray, @mmarionncern, @rovere, @sameasy, @seemasharmafnal this is something you requested to watch as well. @ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.
cms-bot commands are listed here
please test
+1
Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0d0802/49212/summary.html
COMMIT: 34edc03be36294e1a9f81212857b01314d79e98d
CMSSW: CMSSW_16_0_X_2025-11-03-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/49295/49212/install.sh to create a dev area with all the needed externals and cmssw changes.
Comparison Summary
Summary:
- No significant changes to the logs found
- Reco comparison results: 4 differences found in the comparisons
- DQMHistoTests: Total files compared: 51
- DQMHistoTests: Total histograms compared: 3939953
- DQMHistoTests: Total failures: 50
- DQMHistoTests: Total nulls: 0
- DQMHistoTests: Total successes: 3939883
- DQMHistoTests: Total skipped: 20
- DQMHistoTests: Total Missing objects: 0
- DQMHistoSizes: Histogram memory added: 0.0 KiB( 50 files compared)
- Checked 218 log files, 188 edm output root files, 51 DQM output files
- TriggerResults: no differences found
hold
Just for me to have a moment to digest it
Pull request has been put on hold by @makortel
They need to issue an unhold command to remove the hold state or L1 can unhold it for all
Thanks @jsamudio. This PR is in the right direction, but I'm afraid it is not sufficient to fully resolve the policy violation.
Customization of
https://github.com/cms-sw/cmssw/blob/34edc03be36294e1a9f81212857b01314d79e98d/RecoParticleFlow/PFClusterProducer/python/pfClusterHBHEAlpaka_cff.py#L82
should be done where pfClusteringHBHEHFTask is defined, i.e. in RecoParticleFlow/PFClusterProducer/python/particleFlowCluster_cff.py. Same for
https://github.com/cms-sw/cmssw/blob/34edc03be36294e1a9f81212857b01314d79e98d/RecoParticleFlow/PFClusterProducer/python/pfClusterHBHEAlpaka_cff.py#L128
In addition, the legacyPFRecHitProducer and legacyPFClusterProducer would be better dealt such that the corresponding legacy producer is alpaka.toReplaceWith() with the LegacyPFRecHitProducer and LegacyPFClusterProducer that convert the SoAs to legacy data products.
If I read a configuration dump correctly, those would be https://github.com/cms-sw/cmssw/blob/34edc03be36294e1a9f81212857b01314d79e98d/RecoParticleFlow/PFClusterProducer/python/particleFlowRecHitHBHE_cfi.py#L10 and https://github.com/cms-sw/cmssw/blob/34edc03be36294e1a9f81212857b01314d79e98d/RecoParticleFlow/PFClusterProducer/python/particleFlowClusterHBHE_cfi.py#L19-L20
Actually with those changes (i.e. customizing the EDProducer type a given module label corresponds to instead of the downstream modules for their input these customizations
https://github.com/cms-sw/cmssw/blob/34edc03be36294e1a9f81212857b01314d79e98d/RecoParticleFlow/PFClusterProducer/python/particleFlowClusterHCAL_cfi.py#L75-L79
wouldn't be needed, because the default value for the clustersSource would already correspond to the Alpaka-based chain of modules.
@makortel Thanks for clarifying, I'll work on fixing it.
@makortel Regarding the second half of your comment, I think I remember why I did not use, for example,
alpaka.toReplaceWith(particleFlowClusterHBHEOnly, legacyPFClusterProducerHBHEOnly)
For the .423 workflow, the GPU vs legacy comparison is included as:
https://github.com/cms-sw/cmssw/blob/1cf33b6cf0c5a245f6d57205b888e5f6185c1064/Validation/RecoParticleFlow/python/pfCaloGPUComparisonTask_cfi.py#L4-L8
Which particleFlowClusterHBHEOnly is coming from the reconstruction_hcalOnlyLegacy sequence. I am not sure if having the same module labels is problematic in this case or if I can specify which sequence they are coming from? Do you have any advice?
Which
particleFlowClusterHBHEOnlyis coming from thereconstruction_hcalOnlyLegacysequence. I am not sure if having the same module labels is problematic in this case or if I can specify which sequence they are coming from? Do you have any advice?
What happens is that there is exactly one module with particleFlowClusterHBHEOnly label (regardless which Sequences/Tasks contain the module). I.e. the alpaka.toReplaceWith(particleFlowClusterHBHEOnly, legacyPFClusterProducerHBHEOnly) is a global customization.
Our established pattern for "Alpaka vs legacy" comparisons is to define another module label for the legacy chain (or DAG) of modules.
One part in my confusion in the case of pfClusterHBHEOnlyAlpakaComparison was that by just looking the consumed module labels, I thought legacyPFClusterProducerHBHEOnly referred to the chain of legacy modules, rather than the legacy-from-SoA chain.
For an example see e.g. https://github.com/cms-sw/cmssw/blob/master/RecoLocalCalo/EcalRecProducers/python/ecalMultiFitUncalibRecHit_cff.py that defines the legacy-from-SoA-converter like https://github.com/cms-sw/cmssw/blob/1cf33b6cf0c5a245f6d57205b888e5f6185c1064/RecoLocalCalo/EcalRecProducers/python/ecalMultiFitUncalibRecHit_cff.py#L31 and then the full legacy chain like https://github.com/cms-sw/cmssw/blob/1cf33b6cf0c5a245f6d57205b888e5f6185c1064/RecoLocalCalo/EcalRecProducers/python/ecalMultiFitUncalibRecHit_cff.py#L6 https://github.com/cms-sw/cmssw/blob/1cf33b6cf0c5a245f6d57205b888e5f6185c1064/RecoLocalCalo/EcalRecProducers/python/ecalMultiFitUncalibRecHit_cff.py#L42-L46
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49295/46851
- There are other open Pull requests which might conflict with changes you have proposed:
- File Configuration/PyReleaseValidation/python/upgradeWorkflowComponents.py modified in PR(s): #49339, #49383
- File RecoLocalCalo/Configuration/python/hcalGlobalReco_cff.py modified in PR(s): #46076
- File RecoLocalCalo/Configuration/python/hcalLocalReco_cff.py modified in PR(s): #45995, #46076
Pull request #49295 was updated. @AdrianoDee, @DickyChant, @antoniovagnerini, @cmsbuild, @ctarricone, @gabrielmscampos, @jfernan2, @mandrenguyen, @miquork, @nothingface0, @rseidita, @srimanob can you please check and sign again.
@makortel Sorry for the long delay. I refactored everything such that the original pfClusterHBHEAlpaka_cff.py is not necessary, and moved the customizations from Alpaka to (I think) the right locations. If there is still an issue, please let me know.
please test
-1
Failed Tests: RelVals RelVals-INPUT AddOn
Size: This PR adds an extra 52KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0d0802/49515/summary.html
COMMIT: 40354ceb0896103b5428e883378068e0d8a6e584
CMSSW: CMSSW_16_0_X_2025-11-17-2300/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/49295/49515/install.sh to create a dev area with all the needed externals and cmssw changes.
Failed RelVals
----- Begin Fatal Exception 18-Nov-2025 09:25:58 CET-----------------------
An exception of category 'PluginNotFound' occurred while
[0] Constructing the EventProcessor
Exception Message:
Unable to find plugin 'PFRecHitHCALParamsESProducer@alpaka' in category 'CMS EDM Framework ESModule'. Please check spelling of name.
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 18-Nov-2025 09:25:58 CET-----------------------
An exception of category 'PluginNotFound' occurred while
[0] Constructing the EventProcessor
Exception Message:
Unable to find plugin 'PFRecHitHCALParamsESProducer@alpaka' in category 'CMS EDM Framework ESModule'. Please check spelling of name.
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 18-Nov-2025 09:38:56 CET-----------------------
An exception of category 'PluginNotFound' occurred while
[0] Constructing the EventProcessor
Exception Message:
Unable to find plugin 'PFRecHitHCALParamsESProducer@alpaka' in category 'CMS EDM Framework ESModule'. Please check spelling of name.
----- End Fatal Exception -------------------------------------------------
Failed RelVals-INPUT
- 139.005
139.005_AlCaPhiSym2021/step2_AlCaPhiSym2021.log
Failed AddOn Tests
----- Begin Fatal Exception 18-Nov-2025 09:23:57 CET-----------------------
An exception of category 'PluginNotFound' occurred while
[0] Constructing the EventProcessor
Exception Message:
Unable to find plugin 'PFRecHitHCALParamsESProducer@alpaka' in category 'CMS EDM Framework ESModule'. Please check spelling of name.
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 18-Nov-2025 09:23:57 CET-----------------------
An exception of category 'PluginNotFound' occurred while
[0] Constructing the EventProcessor
Exception Message:
Unable to find plugin 'PFRecHitHCALParamsESProducer@alpaka' in category 'CMS EDM Framework ESModule'. Please check spelling of name.
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 18-Nov-2025 09:23:57 CET-----------------------
An exception of category 'PluginNotFound' occurred while
[0] Constructing the EventProcessor
Exception Message:
Unable to find plugin 'PFRecHitHCALParamsESProducer@alpaka' in category 'CMS EDM Framework ESModule'. Please check spelling of name.
----- End Fatal Exception -------------------------------------------------
Expand to see more addon errors ...
These errors
Unable to find plugin 'PFRecHitHCALParamsESProducer@alpaka' in category 'CMS EDM Framework ESModule'. Please check spelling of name.
could be fixed by adding all the Alpaka-based ESProducers in one labeled Task, and then adding that Task into the _alpaka_pfClusteringHBHEHFTask. Then the framework always sees these ESProducers in some Task and won't enable them by default (this behavior is described in https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideAboutPythonConfigFile#Behavior_when_an_ESProducer_ESSo)
See e.g. https://github.com/cms-sw/cmssw/blob/318ca4d1abe3a7df429290937e95bf32dfac84c8/EventFilter/EcalRawToDigi/python/ecalDigis_cff.py#L20-L22 and https://github.com/cms-sw/cmssw/pull/48703 for examples
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49295/47012
Pull request #49295 was updated. @cmsbuild, @ctarricone, @gabrielmscampos, @jfernan2, @mandrenguyen, @nothingface0, @rseidita, @srimanob can you please check and sign again.
I've updated the branch to both address the earlier comments and change the style of the changes. So I keep both a full legacy sequence, and apply the alpaka modifier onto the normal sequence. This is instead of doing both in the same sequence, which was the last iteration.
@cmsbuild, please test
-1
Failed Tests: UnitTests AddOn
Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0d0802/49743/summary.html
COMMIT: 9d86caef027b62e141f9bb53dbc9e078cd4b650b
CMSSW: CMSSW_16_0_X_2025-12-01-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/49295/49743/install.sh to create a dev area with all the needed externals and cmssw changes.
Failed Unit Tests
I found 1 errors in the following unit tests:
---> test TestDQMGUIUpload had ERRORS
Failed AddOn Tests
----- Begin Fatal Exception 01-Dec-2025 22:58:05 CET-----------------------
An exception of category 'EventSetupConflict' occurred while
[0] Calling beginJob
Exception Message:
two EventSetup Producers want to deliver type="reco::PFRecHitHCALParamsHostCollection" label=""
from record PFRecHitHCALParamsRecord. The two providers are
1) type="PFRecHitHCALParamsESProducer@alpaka" label="hltESPPFRecHitHCALParams"
2) type="PFRecHitHCALParamsESProducer@alpaka" label="pfRecHitHCALParamsESProducer"
Please either
remove one of these Producers
or find a way of configuring one of them so it does not deliver this data
or use an es_prefer statement in the configuration to choose one.
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 01-Dec-2025 22:57:59 CET-----------------------
An exception of category 'EventSetupConflict' occurred while
[0] Calling beginJob
Exception Message:
two EventSetup Producers want to deliver type="reco::PFRecHitHCALParamsHostCollection" label=""
from record PFRecHitHCALParamsRecord. The two providers are
1) type="PFRecHitHCALParamsESProducer@alpaka" label="hltESPPFRecHitHCALParams"
2) type="PFRecHitHCALParamsESProducer@alpaka" label="pfRecHitHCALParamsESProducer"
Please either
remove one of these Producers
or find a way of configuring one of them so it does not deliver this data
or use an es_prefer statement in the configuration to choose one.
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 01-Dec-2025 22:57:27 CET-----------------------
An exception of category 'EventSetupConflict' occurred while
[0] Calling beginJob
Exception Message:
two EventSetup Producers want to deliver type="reco::PFRecHitHCALParamsHostCollection" label=""
from record PFRecHitHCALParamsRecord. The two providers are
1) type="PFRecHitHCALParamsESProducer@alpaka" label="hltESPPFRecHitHCALParams"
2) type="PFRecHitHCALParamsESProducer@alpaka" label="pfRecHitHCALParamsESProducer"
Please either
remove one of these Producers
or find a way of configuring one of them so it does not deliver this data
or use an es_prefer statement in the configuration to choose one.
----- End Fatal Exception -------------------------------------------------
Expand to see more addon errors ...
Comparison Summary
Summary:
- You potentially added 7 lines to the logs
- ROOTFileChecks: Some differences in event products or their sizes found
- Reco comparison results: 8 differences found in the comparisons
- Reco comparison had 4 failed jobs
- DQMHistoTests: Total files compared: 53
- DQMHistoTests: Total histograms compared: 4269233
- DQMHistoTests: Total failures: 80
- DQMHistoTests: Total nulls: 0
- DQMHistoTests: Total successes: 4269133
- DQMHistoTests: Total skipped: 20
- DQMHistoTests: Total Missing objects: 0
- DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
- Checked 227 log files, 198 edm output root files, 53 DQM output files
- TriggerResults: no differences found
I'm puzzled about the failure. This PR defines the pfRecHitHCALParamsESProducer in
https://github.com/cms-sw/cmssw/blob/9d86caef027b62e141f9bb53dbc9e078cd4b650b/RecoParticleFlow/PFClusterProducer/python/particleFlowCluster_cff.py#L122
then inserts it into a Task
https://github.com/cms-sw/cmssw/blob/9d86caef027b62e141f9bb53dbc9e078cd4b650b/RecoParticleFlow/PFClusterProducer/python/particleFlowCluster_cff.py#L130-L134
that is included in two other Tasks
https://github.com/cms-sw/cmssw/blob/9d86caef027b62e141f9bb53dbc9e078cd4b650b/RecoParticleFlow/PFClusterProducer/python/particleFlowCluster_cff.py#L158
https://github.com/cms-sw/cmssw/blob/9d86caef027b62e141f9bb53dbc9e078cd4b650b/RecoParticleFlow/PFClusterProducer/python/particleFlowCluster_cff.py#L177
that are "enabled" with the alpaka modifier in
https://github.com/cms-sw/cmssw/blob/9d86caef027b62e141f9bb53dbc9e078cd4b650b/RecoParticleFlow/PFClusterProducer/python/particleFlowCluster_cff.py#L181-L182
This should result the framework to not use the pfRecHitHCALParamsESProducer (because it is put in a Task), unless alpaka modifier is enabled (it should be disabled in these tests).
Looking at the edmConfigDump shows that the pfRecHitHCALParamsESProducer would not be in any Task.
Investigating RelVal_HLT_RAW2DIGI_L1Reco_RECO.py directly, it has lines
from HLTrigger.Configuration.CustomConfigs import L1THLT,HLTRECO
process = HLTRECO(process)
that makes the process.alpaka_pfClusteringHBHEHF_esProducersTask empty.
The HLTRECO indeed removes all ESModules from Tasks :(
https://github.com/cms-sw/cmssw/blob/61a1113117cf3a7aafee41dee39a175dbda615f4/HLTrigger/Configuration/python/CustomConfigs.py#L107-L132
The reason why this pattern of placing Alpaka ESProducers in Tasks works e.g. in ECAL is that the HLT menu uses the same label (e.g. ecalElectronicsMappingHostESProducer) in the reconstruction configuration and the HLT menu.
I suppose that pattern should be followed with PF too. That would mean the HLT menu should be updated to use
pfRecHitHCALParamsRecordSourceinstead ofhltESSPFRecHitHCALParamsRecordpfRecHitHCALTopologyRecordSourceinstead ofhltESSPFRecHitHCALTopologyRecordpfRecHitHCALParamsESProducerinstead ofhltESPPFRecHitHCALParamspfRecHitHCALTopologyESProducerinstead ofhltESPPFRecHitHCALTopology
I suppose that pattern should be followed with PF too. That would mean the HLT menu should be updated to use
Using the same label for offline and HLT is in general heavily discourage. It is fine only if it can be guaranteed that, under no circumstanced whatsoever, the content of the offline and HLT configuration will never need to be different. Is this the case here?
It is fine only if it can be guaranteed that, under no circumstanced whatsoever, the content of the offline and HLT configuration will never need to be different. Is this the case here?
@jsamudio @hatakeyamak Could you comment?
If that is not the case, an alternative would be to specify different labels (of which one could be empty string) for the ESProducts for HLT and offline cases.
I think we can say we have no plan to make them differ between HLT and offline.
at least for these parameters.
That's not enough, they would need to be integrally identical in the full list of parameters, otherwise in HLT+RECO jobs one of the two would get overridden.