cmssw icon indicating copy to clipboard operation
cmssw copied to clipboard

Update of Heavy Ion Vertexing for pp_on_AA and XeXe eras

Open abaty opened this issue 2 years ago • 19 comments

The vertexing used in 2017/2018 heavy ion AA collisions worked well in most collisions (10-90% centralities), but was found to have deficiencies in very-low and very-high occupancy events. This PR seeks to fix both of these issues before the start of the HI Run3.

In low-occupancy events, vertex inefficiency was caused by track filters which could lower the number of accepted tracks in the event to a number <2. This PR removes the track filter requirement if the event has <10 filtered tracks to increase efficiency. This change is done at the python configuration level.

In high-occupancy events, a fairly large amount of vertex splitting is found and the cpu time consumed is quite large because of the large number of tracks passing the filters. This PR raises the filter pt threshold if >1000 filtered tracks are found in the event. This causes the vertexing to run ~40% faster and reduces vertex splitting, while only impacting the vertex resolution minimally. This change results in small modifications to the C++ code.

A summary of performance can be seen in these slides shown in the heavy ion tracking group, as well as PAG meeting: here

PR validation:

The PR was validated by rerunning the vertexing for 2018 minimum-bias PbPb MC and data and comparing the output collection to the nominal cases. One vertex is expected per event in these samples.

@CesarBernardes @mandrenguyen

abaty avatar May 27 '22 06:05 abaty

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38097/30211

  • This PR adds an extra 20KB to repository

cmsbuild avatar May 27 '22 06:05 cmsbuild

A new Pull Request was created by @abaty for master.

It involves the following packages:

  • RecoVertex/PrimaryVertexProducer (reconstruction)

@jpata, @cmsbuild, @clacaputo, @slava77 can you please review it and eventually sign? Thanks. @GiacomoSguazzoni, @JanFSchulte, @rovere, @VinInn, @mmusich, @mtosi, @dgulhan this is something you requested to watch as well. @perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

cmsbuild avatar May 27 '22 06:05 cmsbuild

@werdmann FYI

mmusich avatar May 27 '22 14:05 mmusich

@cmsbuild please test

clacaputo avatar Jun 01 '22 12:06 clacaputo

This causes the vertexing to run ~40% faster and reduces vertex splitting, while only impacting the vertex resolution minimally.

Hi @abaty , do you happen to have a detailed result obtained profiling? How do you measure the ~40% decrease? thanks

clacaputo avatar Jun 01 '22 12:06 clacaputo

@clacaputo I ran a simple CMSSW job containing only the vertexing module on a sample of 25k MB events. The results (from the edm timing service module) for different Ntrk cuts are shown in slide 5 of the slides linked in the main PR description. The nominal case is ~0.33s/evt while this PR lowers it to around 0.2s/evt.

I believe this is consistent with what we saw in 2018, where the central event vertexing in particular was eating up a significant amount of CPU time.

abaty avatar Jun 01 '22 14:06 abaty

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-39bb58/25162/summary.html COMMIT: 96a3e7880525f5af878d5c646ce95490b647b553 CMSSW: CMSSW_12_5_X_2022-06-01-1100/el8_amd64_gcc10 User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/38097/25162/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 718 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3649923
  • DQMHistoTests: Total failures: 564
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3649337
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

cmsbuild avatar Jun 01 '22 18:06 cmsbuild

@clacaputo We did expect a small effect on generalTracks because of the P.V. producer propagating via a clone to firstStepPrimaryVertices and firstStepPrimaryVerticesPreSplitting. However, what you show is a bit larger than what we were expecting.

We will investigate this in more detail and get back to you soon.

abaty avatar Jun 03 '22 17:06 abaty

Hi @abaty , this is a kindly reminder that if you want this feature in pre4, it should be signoff by 24.07

clacaputo avatar Jul 20 '22 09:07 clacaputo

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38097/31315

  • This PR adds an extra 28KB to repository

cmsbuild avatar Jul 28 '22 15:07 cmsbuild

Pull request #38097 was updated. @jpata, @cmsbuild, @clacaputo can you please check and sign again.

cmsbuild avatar Jul 28 '22 15:07 cmsbuild

@cmsbuild please test

clacaputo avatar Jul 28 '22 15:07 clacaputo

Apologies for the delay, but we would like to resurrect this PR now. Our studies indicate that the tracking differences were indeed related to the propagation of changes to the firstStepPrimaryVertices(Presplitting) modules. To limit these differences, we now effectively turn off the pT-threshold scaling in central events for these modules, so only tiny differences in the most peripheral events might be expected. All changes described in the initial PR summary are still applied to the final vertexing module (OfflinePrimaryVertices).

@mandrenguyen

abaty avatar Jul 28 '22 15:07 abaty

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-39bb58/26511/summary.html COMMIT: e07a6ee2af08e890defeb58e71ccfd4bc1855592 CMSSW: CMSSW_12_5_X_2022-07-28-1100/el8_amd64_gcc10 User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/38097/26511/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 242 differences found in the comparisons
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3668050
  • DQMHistoTests: Total failures: 27
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3668001
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 50 files compared)
  • Checked 210 log files, 47 edm output root files, 51 DQM output files
  • TriggerResults: no differences found

cmsbuild avatar Jul 28 '22 19:07 cmsbuild

Hi @abaty , inspecting the reco differences, there aren't any more diff in generalTracks and all the other (but one, more later) seems in line with the PR and can be classified as fluctuation/bin-migration.

There is only one plot that seems suspicious to me recoPFCandidates_akCs4PFJets_pfParticlesCs_reRECO_obj_phi, where the fluctuation happens only for the positive values of phi. I can't explain why it is happening, maybe you have an explanation.

all_OldVSNew_RunHI2018wf140p56c_recoPFCandidates_akCs4PFJets_pfParticlesCs_reRECO_obj_phi

clacaputo avatar Aug 03 '22 13:08 clacaputo

Hi @abaty , inspecting the reco differences, there aren't any more diff in generalTracks and all the other (but one, more later) seems in line with the PR and can be classified as fluctuation/bin-migration.

There is only one plot that seems suspicious to me recoPFCandidates_akCs4PFJets_pfParticlesCs_reRECO_obj_phi, where the fluctuation happens only for the positive values of phi. I can't explain why it is happening, maybe you have an explanation.

all_OldVSNew_RunHI2018wf140p56c_recoPFCandidates_akCs4PFJets_pfParticlesCs_reRECO_obj_phi

This difference in the constituents of akCs4PFJets is also shown in the jets themselves: all_OldVSNew_RunHI2018wf140p56c_recoPFJets_akCs4PFJets__reRECO_obj_phi

I suspect that in a particular event the jet clustering simply plays out a bit differently, and it just happens to be localized on one side. This looks like only a handful of events. Is is straight-forward to increase the test to like 100 events or so?

mandrenguyen avatar Aug 04 '22 11:08 mandrenguyen

I suspect that in a particular event the jet clustering simply plays out a bit differently, and it just happens to be localized on one side. This looks like only a handful of events. Is is straight-forward to increase the test to like 100 events or so?

Unfortunately, you can't change the number of events in the bot. But you can do it locally, running the wf for baseline and baseline+PR and using validateJR.sh script. More info can be found here

clacaputo avatar Aug 04 '22 14:08 clacaputo

@clacaputo I ran 200 events using a different input file from the same dataset, checking that it's also marked as good in the json. Unfortunately, I could not access the default input files, which are only at T1. Anyway, with higher stats I see that the small differences that are observed in the phi distribution that you pointed to show up for both positive and negative phi.

image image

mandrenguyen avatar Aug 08 '22 13:08 mandrenguyen

@mandrenguyen thanks for the check. From my side, I'm just waiting for @abaty to implement the suggestion made by Marco and then I'll be ready to sign

clacaputo avatar Aug 08 '22 14:08 clacaputo

@clacaputo Apologies for the delay; I was on vacation. Marco's comments should be addressed now.

abaty avatar Aug 17 '22 21:08 abaty

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38097/31627

  • This PR adds an extra 20KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

  • code-format: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38097/31627/code-format.patch e.g. curl -k https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38097/31627/code-format.patch | patch -p1 You can also run scram build code-format to apply code format directly

cmsbuild avatar Aug 17 '22 21:08 cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38097/31629

  • This PR adds an extra 20KB to repository

cmsbuild avatar Aug 18 '22 04:08 cmsbuild

Pull request #38097 was updated. @jpata, @cmsbuild, @mandrenguyen, @clacaputo can you please check and sign again.

cmsbuild avatar Aug 18 '22 04:08 cmsbuild

@cmsbuild please test

jpata avatar Aug 18 '22 06:08 jpata

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-39bb58/26898/summary.html COMMIT: 87788437c603484e9689f9b1432266eac9c94fa5 CMSSW: CMSSW_12_5_X_2022-08-17-2300/el8_amd64_gcc10 User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/38097/26898/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 158 differences found in the comparisons
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3692500
  • DQMHistoTests: Total failures: 55
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3692422
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 50 files compared)
  • DQMHistoSizes: changed ( 312.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 212 log files, 49 edm output root files, 51 DQM output files
  • TriggerResults: no differences found

cmsbuild avatar Aug 18 '22 11:08 cmsbuild

+reconstruction

  • vertexing cuts for HI
  • changes only in HI workflows

jpata avatar Aug 22 '22 08:08 jpata

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

cmsbuild avatar Aug 22 '22 08:08 cmsbuild

@abaty please apply the two fixes suggested by @jpata in https://github.com/cms-sw/cmssw/pull/38097#pullrequestreview-1080076974 and https://github.com/cms-sw/cmssw/pull/38097#pullrequestreview-1080077669

Since this PR only needs the reco signature, applying only those two fixes will not delay significantly the merging in the release.

perrotta avatar Aug 22 '22 09:08 perrotta

+1

  • Let have this merged, so that it can enter pre5
  • Fixes in https://github.com/cms-sw/cmssw/pull/38097#pullrequestreview-1080076974 and https://github.com/cms-sw/cmssw/pull/38097#pullrequestreview-1080077669 will be applied in a forthcoming bug-fix PR

perrotta avatar Aug 23 '22 06:08 perrotta