cmssw
[RECO-UPGRADE] [GCC12] Disable cuda builds if cuda does not support gcc version
Disabled building CUDA tests/binaries if CUDA does not support the GCC version in use; e.g. CUDA currently does not work with gcc12.
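As a rough illustration of the idea only (this is not the build-rule change made in this PR), a compile-time check of the host GCC version could look like the sketch below. The names `kMaxGccMajorForCuda` and `cudaSupportsGcc` and the limit value 11 are hypothetical; the real supported range depends on the CUDA release in use.

```cpp
#include <cassert>

// Hypothetical value for illustration only: the highest GCC major version
// that the installed CUDA toolkit is assumed to accept as a host compiler.
constexpr int kMaxGccMajorForCuda = 11;

// Returns true when CUDA sources could be built with the given GCC major version.
constexpr bool cudaSupportsGcc(int gccMajor, int maxSupportedMajor = kMaxGccMajorForCuda) {
  return gccMajor <= maxSupportedMajor;
}

// __GNUC__ is predefined by GCC (and compatible compilers) as its major version.
#if defined(__GNUC__)
constexpr bool kBuildCudaTargets = cudaSupportsGcc(__GNUC__);
#else
constexpr bool kBuildCudaTargets = false;  // unknown host compiler: skip CUDA targets
#endif
```

With the assumed limit of 11, a gcc10 build would keep its CUDA targets and a gcc12 build would skip them.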
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38396/30596
- This PR adds an extra 12KB to repository
A new Pull Request was created by @smuzaffar (Malik Shahzad Muzaffar) for master.
It involves the following packages:
- RecoLocalCalo/HGCalRecProducers (upgrade, reconstruction)
@clacaputo, @cmsbuild, @AdrianoDee, @srimanob, @slava77, @jpata can you please review it and eventually sign? Thanks. @edjtscott, @vandreev11, @sethzenz, @bsunanda, @felicepantaleo, @rovere, @lgray, @cseez, @apsallid, @pfs, @lecriste, @hatakeyamak, @trtomei, @ebrondol, @beaucero this is something you requested to watch as well. @perrotta, @dpiparo, @qliphy you are the release manager for this.
cms-bot commands are listed here
please test for el8_amd64_gcc12
please test
-1
Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f03205/25580/summary.html
COMMIT: fa0759e226a44e8490bcba3f08270b19d9c6539a
CMSSW: CMSSW_12_5_X_2022-06-15-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/38396/25580/install.sh
to create a dev area with all the needed externals and cmssw changes.
Build
I found compilation error when building:
/cvmfs/cms-ib.cern.ch/nweek-02737/el8_amd64_gcc12/external/gcc/12.1.1-bf4aef5069fdf6bb6f77f897bcc8a6ae/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.1.1/../../../../x86_64-redhat-linux-gnu/bin/ld: tmp/el8_amd64_gcc12/src/RecoLocalCalo/HGCalRecProducers/plugins/RecoLocalCaloHGCalRecProducersPlugins/HEFRecHitGPUtoSoA.cc.o: in function `HEFRecHitGPUtoSoA::acquire(edm::Event const&, edm::EventSetup const&, edm::WaitingTaskWithArenaHolder) [clone .cold]':
HEFRecHitGPUtoSoA.cc:(.text.unlikely+0xbe): undefined reference to `KernelManagerHGCalRecHit::~KernelManagerHGCalRecHit()'
/cvmfs/cms-ib.cern.ch/nweek-02737/el8_amd64_gcc12/external/gcc/12.1.1-bf4aef5069fdf6bb6f77f897bcc8a6ae/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.1.1/../../../../x86_64-redhat-linux-gnu/bin/ld: tmp/el8_amd64_gcc12/src/RecoLocalCalo/HGCalRecProducers/plugins/RecoLocalCaloHGCalRecProducersPlugins/HeterogeneousHGCalHEFCellPositionsConditions.cc.o: in function `HeterogeneousHGCalHEFCellPositionsConditions::getHeterogeneousConditionsESProductAsync(CUstream_st*) const':
HeterogeneousHGCalHEFCellPositionsConditions.cc:(.text+0x7e4): undefined reference to `KernelManagerHGCalCellPositions::KernelManagerHGCalCellPositions(unsigned long const&)'
/cvmfs/cms-ib.cern.ch/nweek-02737/el8_amd64_gcc12/external/gcc/12.1.1-bf4aef5069fdf6bb6f77f897bcc8a6ae/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.1.1/../../../../x86_64-redhat-linux-gnu/bin/ld: HeterogeneousHGCalHEFCellPositionsConditions.cc:(.text+0x7f0): undefined reference to `KernelManagerHGCalCellPositions::fill_positions(hgcal_conditions::HeterogeneousHEFCellPositionsConditionsESProduct const*)'
collect2: error: ld returned 1 exit status
gmake: *** [tmp/el8_amd64_gcc12/src/RecoLocalCalo/HGCalRecProducers/plugins/RecoLocalCaloHGCalRecProducersPlugins/libRecoLocalCaloHGCalRecProducersPlugins.so] Error 1
Leaving library rule at src/RecoLocalCalo/HGCalRecProducers/plugins
Entering library rule at RecoLocalCalo/HGCalRecProducers
>> Compiling /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_5_X_2022-06-15-2300/src/RecoLocalCalo/HGCalRecProducers/src/ComputeClusterTime.cc
>> Compiling /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_5_X_2022-06-15-2300/src/RecoLocalCalo/HGCalRecProducers/src/HGCalRecHitWorkerFactory.cc
please test
So it looks like the CUDA code is not optional. There are some *GPU*.cc files, e.g. EERecHitGPU.cc and EERecHitGPUtoSoA.cc, which require CUDA code:
EERecHitGPU.cc:(.text+0x208c): undefined reference to `KernelManagerHGCalRecHit::run_kernels(KernelConstantData<HGCeeUncalibRecHitConstantData> const*, CUstream_st* const&)'
EERecHitGPU.cc:(.text+0x20d9): undefined reference to `KernelManagerHGCalRecHit::~KernelManagerHGCalRecHit()'
Is the code in *GPU*.cc really required for non-GPU runs? If not, then we can skip these files too.
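The undefined references reported in this thread follow a common pattern: host-side .cc code uses a class whose member functions are defined only in CUDA sources, so skipping those sources leaves the linker with nothing to resolve. The sketch below is a minimal illustration under stated assumptions, not code from the package: `HYPOTHETICAL_BUILD_CUDA`, `KernelManagerSketch` and `runRecHits` are made-up names, and the macro is assumed to be defined by the build system only when the CUDA sources are compiled.

```cpp
#include <cassert>
#include <string>

// Hypothetical macro: assume the build system defines it only when the
// CUDA (.cu) sources of the package are actually compiled.
// #define HYPOTHETICAL_BUILD_CUDA 1

#ifdef HYPOTHETICAL_BUILD_CUDA
// Declaration only; in this sketch the definitions would live in a .cu file.
// If that file were skipped while this code is still compiled, every use
// below would become an undefined reference at link time.
class KernelManagerSketch {
public:
  KernelManagerSketch();
  ~KernelManagerSketch();
  void run_kernels();
};
#endif

// Host-side entry point, standing in for a *GPU*.cc plugin.
std::string runRecHits() {
#ifdef HYPOTHETICAL_BUILD_CUDA
  KernelManagerSketch manager;
  manager.run_kernels();
  return "gpu";
#else
  // CUDA sources not built: no reference to the kernel manager is emitted.
  return "cpu-only";
#endif
}
```

Guarding the host code this way, or skipping the *GPU*.cc files entirely, keeps the library linkable when the CUDA sources are not built.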
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f03205/25585/summary.html
COMMIT: fa0759e226a44e8490bcba3f08270b19d9c6539a
CMSSW: CMSSW_12_5_X_2022-06-16-2300/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/38396/25585/install.sh
to create a dev area with all the needed externals and cmssw changes.
Comparison Summary
Summary:
- No significant changes to the logs found
- Reco comparison results: 0 differences found in the comparisons
- DQMHistoTests: Total files compared: 50
- DQMHistoTests: Total histograms compared: 3659074
- DQMHistoTests: Total failures: 2
- DQMHistoTests: Total nulls: 0
- DQMHistoTests: Total successes: 3659050
- DQMHistoTests: Total skipped: 22
- DQMHistoTests: Total Missing objects: 0
- DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
- Checked 208 log files, 45 edm output root files, 50 DQM output files
- TriggerResults: no differences found
Is the code in *GPU*.cc really required for non-GPU runs? If not, then we can skip these files too.
Naively, I'd say no, they shouldn’t rely on GPU code for non-GPU runs, but maybe @cms-sw/heterogeneous-l2 can comment.
@clacaputo you are probably right. The best approach would be to split the plugin in two: one working entirely on CPU, and one with the GPU code. However, I would suggest letting the authors of the package take care of this.
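A toy sketch of what such a split could look like is given below. It does not use the CMSSW plugin manager; the registry, the plugin names `RecHitCPU` and `RecHitGPU`, and the two registration functions are all hypothetical, and each registration function stands in for a separate plugin library.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for a plugin registry: name -> factory.
using PluginFactory = std::function<std::string()>;

std::map<std::string, PluginFactory>& pluginRegistry() {
  static std::map<std::string, PluginFactory> registry;
  return registry;
}

// Stands in for the CPU-only plugin library, which is always built.
void registerCpuPlugins() {
  pluginRegistry()["RecHitCPU"] = [] { return std::string("cpu rechits"); };
}

// Stands in for the GPU plugin library, which would be built and loaded
// only when the CUDA sources can be compiled.
void registerGpuPlugins() {
  pluginRegistry()["RecHitGPU"] = [] { return std::string("gpu rechits"); };
}

bool hasPlugin(const std::string& name) {
  return pluginRegistry().count(name) > 0;
}
```

If only the CPU library is loaded, `hasPlugin("RecHitGPU")` is false and nothing in the CPU library refers to CUDA symbols, so it links regardless of whether the GPU library was built.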
Hi @smuzaffar, EERecHitGPU.cc and EERecHitGPUtoSoA.cc seem to be used only in some test code, so they can easily be skipped. Concerning the other plugins showing the same behaviour (do you have a list?), maybe we could open an issue and ask the main developer to address the splitting suggested by @fwyzard.
@cmsbuild please test for el8_amd64_gcc12
Hi @smuzaffar, I've tried to refresh the test results using el8_amd64_gcc12, but the test failed with this error:
'Unable to find CMSSW release for CMSSW_12_5_X/el8_amd64_gcc12'
Am I doing something wrong?
@cmsbuild please test for el8_amd64_gcc12
by the way, do we have gcc12 IBs to test this locally ?
@fwyzard the gcc12 IBs are broken, so we only build them on demand. I have just started one, which should be available for testing later this evening.
Thanks - I won't be able to have a look until next week, but I guess the IB should stay around for another 10 days or so.
@smuzaffar
the gcc12 IBs are broken, so we only build them on demand. I have just started one, which should be available for testing later this evening.
Ah... they are very broken; we don't seem to get any CMSSW built at all :-/
It is pretty low on my priority list, but is there a way to see what is failing, other than attempting the full build locally?
please test for el8_amd64_gcc12
-1
Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f03205/27633/summary.html
COMMIT: fa0759e226a44e8490bcba3f08270b19d9c6539a
CMSSW: CMSSW_12_6_X_2022-09-17-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/38396/27633/install.sh
to create a dev area with all the needed externals and cmssw changes.
Build
I found compilation error when building:
/cvmfs/cms-ib.cern.ch/nweek-02750/el8_amd64_gcc12/external/gcc/12.2.0-f8ec77b592790702d83afb7106a458e3/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.2.1/../../../../x86_64-redhat-linux-gnu/bin/ld: tmp/el8_amd64_gcc12/src/RecoLocalCalo/HGCalRecProducers/plugins/RecoLocalCaloHGCalRecProducersPlugins/HEFRecHitGPUtoSoA.cc.o: in function `HEFRecHitGPUtoSoA::acquire(edm::Event const&, edm::EventSetup const&, edm::WaitingTaskWithArenaHolder) [clone .cold]':
HEFRecHitGPUtoSoA.cc:(.text.unlikely+0xa3): undefined reference to `KernelManagerHGCalRecHit::~KernelManagerHGCalRecHit()'
/cvmfs/cms-ib.cern.ch/nweek-02750/el8_amd64_gcc12/external/gcc/12.2.0-f8ec77b592790702d83afb7106a458e3/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.2.1/../../../../x86_64-redhat-linux-gnu/bin/ld: tmp/el8_amd64_gcc12/src/RecoLocalCalo/HGCalRecProducers/plugins/RecoLocalCaloHGCalRecProducersPlugins/HeterogeneousHGCalHEFCellPositionsConditions.cc.o: in function `HeterogeneousHGCalHEFCellPositionsConditions::getHeterogeneousConditionsESProductAsync(CUstream_st*) const':
HeterogeneousHGCalHEFCellPositionsConditions.cc:(.text+0x7e4): undefined reference to `KernelManagerHGCalCellPositions::KernelManagerHGCalCellPositions(unsigned long const&)'
/cvmfs/cms-ib.cern.ch/nweek-02750/el8_amd64_gcc12/external/gcc/12.2.0-f8ec77b592790702d83afb7106a458e3/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.2.1/../../../../x86_64-redhat-linux-gnu/bin/ld: HeterogeneousHGCalHEFCellPositionsConditions.cc:(.text+0x7f0): undefined reference to `KernelManagerHGCalCellPositions::fill_positions(hgcal_conditions::HeterogeneousHEFCellPositionsConditionsESProduct const*)'
collect2: error: ld returned 1 exit status
gmake: *** [tmp/el8_amd64_gcc12/src/RecoLocalCalo/HGCalRecProducers/plugins/RecoLocalCaloHGCalRecProducersPlugins/libRecoLocalCaloHGCalRecProducersPlugins.so] Error 1
Leaving library rule at src/RecoLocalCalo/HGCalRecProducers/plugins
>> Compiling /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_6_X_2022-09-17-1100/src/RecoLocalCalo/HGCalRecProducers/test/EtaPhiSearchInTile_t.cpp
>> Building binary EtaPhiSearchInTileLC
Copying tmp/el8_amd64_gcc12/src/RecoLocalCalo/HGCalRecProducers/test/EtaPhiSearchInTileLC/EtaPhiSearchInTileLC to productstore area:
please test for el8_amd64_gcc12
-1
Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f03205/28231/summary.html
COMMIT: fa0759e226a44e8490bcba3f08270b19d9c6539a
CMSSW: CMSSW_12_6_X_2022-10-12-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/38396/28231/install.sh
to create a dev area with all the needed externals and cmssw changes.
Build
I found compilation error when building:
/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc12/external/gcc/12.2.0-f8ec77b592790702d83afb7106a458e3/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.2.1/../../../../x86_64-redhat-linux-gnu/bin/ld: tmp/el8_amd64_gcc12/src/RecoLocalCalo/HGCalRecProducers/plugins/RecoLocalCaloHGCalRecProducersPlugins/HEFRecHitGPUtoSoA.cc.o: in function `HEFRecHitGPUtoSoA::acquire(edm::Event const&, edm::EventSetup const&, edm::WaitingTaskWithArenaHolder) [clone .cold]':
HEFRecHitGPUtoSoA.cc:(.text.unlikely+0xa3): undefined reference to `KernelManagerHGCalRecHit::~KernelManagerHGCalRecHit()'
/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc12/external/gcc/12.2.0-f8ec77b592790702d83afb7106a458e3/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.2.1/../../../../x86_64-redhat-linux-gnu/bin/ld: tmp/el8_amd64_gcc12/src/RecoLocalCalo/HGCalRecProducers/plugins/RecoLocalCaloHGCalRecProducersPlugins/HeterogeneousHGCalHEFCellPositionsConditions.cc.o: in function `HeterogeneousHGCalHEFCellPositionsConditions::getHeterogeneousConditionsESProductAsync(CUstream_st*) const':
HeterogeneousHGCalHEFCellPositionsConditions.cc:(.text+0x7e4): undefined reference to `KernelManagerHGCalCellPositions::KernelManagerHGCalCellPositions(unsigned long const&)'
/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc12/external/gcc/12.2.0-f8ec77b592790702d83afb7106a458e3/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.2.1/../../../../x86_64-redhat-linux-gnu/bin/ld: HeterogeneousHGCalHEFCellPositionsConditions.cc:(.text+0x7f0): undefined reference to `KernelManagerHGCalCellPositions::fill_positions(hgcal_conditions::HeterogeneousHEFCellPositionsConditionsESProduct const*)'
collect2: error: ld returned 1 exit status
gmake: *** [tmp/el8_amd64_gcc12/src/RecoLocalCalo/HGCalRecProducers/plugins/RecoLocalCaloHGCalRecProducersPlugins/libRecoLocalCaloHGCalRecProducersPlugins.so] Error 1
Leaving library rule at src/RecoLocalCalo/HGCalRecProducers/plugins
Entering library rule at RecoLocalCalo/HGCalRecProducers
>> Compiling /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_6_X_2022-10-12-1100/src/RecoLocalCalo/HGCalRecProducers/src/ComputeClusterTime.cc
>> Compiling /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_6_X_2022-10-12-1100/src/RecoLocalCalo/HGCalRecProducers/src/HGCalRecHitWorkerFactory.cc
@smuzaffar, do we still need this PR?
No, not really, closing it