Update tools for ROCm 5.6.1 [14.0.x]

Open fwyzard opened this issue 10 months ago • 14 comments

Add amd-smi and ROCProfiler binaries and libraries.

fwyzard avatar Apr 17 '24 20:04 fwyzard

please test

fwyzard avatar Apr 17 '24 20:04 fwyzard

A new Pull Request was created by @fwyzard for branch IB/CMSSW_14_0_X/master.

@iarspider, @aandvalenzuela, @smuzaffar can you please review it and eventually sign? Thanks. @antoniovilela, @rappoccio, @sextonkennedy you are the release manager for this. cms-bot commands are listed here

cmsbuild avatar Apr 17 '24 20:04 cmsbuild

cms-bot internal usage

cmsbuild avatar Apr 17 '24 20:04 cmsbuild

This is a partial backport of #9143, adding the same new tools but without updating the version of ROCm.

fwyzard avatar Apr 17 '24 20:04 fwyzard

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-549774/38909/summary.html
COMMIT: 90ed9c67b35b88608a74e364f5b75544be021991
CMSSW: CMSSW_14_0_X_2024-04-17-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9144/38909/install.sh to create a dev area with all the needed externals and cmssw changes.

cmsbuild avatar Apr 18 '24 09:04 cmsbuild

please test

fwyzard avatar Apr 18 '24 17:04 fwyzard

Pull request #9144 was updated.

cmsbuild avatar Apr 18 '24 17:04 cmsbuild

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-549774/38941/summary.html
COMMIT: 3cfcc7c0d7b0dc70976529a3c90dabde8f098f01
CMSSW: CMSSW_14_0_X_2024-04-18-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9144/38941/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

  • @mmusich cms-sw/cmsdist#9140

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-549774/38941/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-549774/38941/git-merge-result

Comparison Summary

Summary:

  • You potentially removed 93 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 3138 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3318384
  • DQMHistoTests: Total failures: 206
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3318158
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 287.004 KiB (47 files compared)
  • DQMHistoSizes: changed ( 10224.0,... ): -2.742 KiB Physics/Top
  • DQMHistoSizes: changed ( 23234.0,... ): 3.979 KiB HGCalHitCalibrationHLT/hgcal_photon_EoP_CPene_scint_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.977 KiB HGCalHitCalibrationHLT/hgcal_photon_EoP_CPene_100_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.977 KiB HGCalHitCalibrationHLT/hgcal_photon_EoP_CPene_200_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.977 KiB HGCalHitCalibrationHLT/hgcal_photon_EoP_CPene_300_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.976 KiB HGCalHitCalibrationHLT/hgcal_ele_EoP_CPene_scint_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.974 KiB HGCalHitCalibrationHLT/hgcal_ele_EoP_CPene_100_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.974 KiB HGCalHitCalibrationHLT/hgcal_ele_EoP_CPene_200_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.974 KiB HGCalHitCalibrationHLT/hgcal_ele_EoP_CPene_300_calib_fraction
  • DQMHistoSizes: changed ( 23234.0,... ): 3.972 KiB HGCalHitCalibrationHLT/hgcal_EoP_CPene_scint_calib_fraction
  • DQMHistoSizes: changed ( 23234.0 ): ...
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found
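
The DQMHistoTests counts in the summary above can be cross-checked with a few lines of Python. This is only a sketch: the `summary` string is copied by hand from the list, the helper name `parse_dqm_counts` is hypothetical, and treating the total as the sum of failures, nulls, successes and skipped is an assumption about how the bot counts, not something it documents here.

```python
import re

# DQMHistoTests lines copied from the comparison summary above.
summary = """\
DQMHistoTests: Total files compared: 48
DQMHistoTests: Total histograms compared: 3318384
DQMHistoTests: Total failures: 206
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3318158
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
"""


def parse_dqm_counts(text):
    """Map each 'Total <name>: <n>' line to an integer (hypothetical helper)."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"DQMHistoTests: Total (.+?): (\d+)$", line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_dqm_counts(summary)

# Assumption: every compared histogram lands in exactly one of these buckets.
accounted = (
    counts["failures"]
    + counts["nulls"]
    + counts["successes"]
    + counts["skipped"]
)
failure_fraction = counts["failures"] / counts["histograms compared"]

print("buckets add up to total:", accounted == counts["histograms compared"])
print(f"failure fraction: {failure_fraction:.4%}")
```

For these numbers the four buckets do add up to the 3318384 histograms compared, so the 206 failures are a very small fraction of the total.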

cmsbuild avatar Apr 19 '24 07:04 cmsbuild

+externals

smuzaffar avatar Apr 19 '24 08:04 smuzaffar

This pull request is fully signed and it will be integrated in one of the next IB/CMSSW_14_0_X/master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @rappoccio, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)

cmsbuild avatar Apr 19 '24 08:04 cmsbuild

@cms-sw/orp-l2 could you merge this for the next 14.0.x release?

Hopefully amd-smi can provide the same information as NVML for AMD GPUs.

fwyzard avatar Apr 22 '24 17:04 fwyzard

hold

fwyzard avatar Apr 24 '24 11:04 fwyzard

given the issues with rocprofiler in ROCm 6.1, let's wait on this

fwyzard avatar Apr 24 '24 11:04 fwyzard

Pull request has been put on hold by @fwyzard. They need to issue an unhold command to remove the hold state, or L1 can unhold it for all.

cmsbuild avatar Apr 24 '24 11:04 cmsbuild