Implement ResourceInformationService

PR description:

Initial implementation of ResourceInformationService. The function acceleratorsTypes() will return a container holding enumeration values. Currently it will either be empty or contain a value for "GPU", the latter if any item in "@selected_accelerators" starts with the substring "gpu-". We expect additional enumeration values may be added in the future.

The following additional functions are added:

  • cpuModels()
  • gpuModels()
  • nvidiaDriverVersion()
  • cudaDriverVersion()
  • cudaRuntimeVersion()
  • cpuModelsFormatted()
  • cpuAverageSpeed()

The service contains data members to hold the data returned by those functions. Other services (CPU and CUDAService) must be configured for these to be filled. We expect other accelerator devices may be added to this in the future.
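
For illustration only, here is a rough sketch of what the interface described above might look like. The enumeration name, return types, and data-member names below are guesses based on this description, not the actual code:

#include <string>
#include <vector>

// Sketch only; everything except the listed function names is hypothetical.
enum class AcceleratorType { GPU };  // more values may be added later

class ResourceInformationService {
public:
  // Empty, or containing AcceleratorType::GPU if any entry of
  // "@selected_accelerators" starts with the substring "gpu-".
  std::vector<AcceleratorType> const& acceleratorsTypes() const { return acceleratorsTypes_; }

  std::vector<std::string> const& cpuModels() const { return cpuModels_; }
  std::vector<std::string> const& gpuModels() const { return gpuModels_; }
  std::string const& nvidiaDriverVersion() const { return nvidiaDriverVersion_; }
  std::string const& cudaDriverVersion() const { return cudaDriverVersion_; }
  std::string const& cudaRuntimeVersion() const { return cudaRuntimeVersion_; }
  std::string const& cpuModelsFormatted() const { return cpuModelsFormatted_; }
  double cpuAverageSpeed() const { return cpuAverageSpeed_; }

private:
  // Data members filled by other services (CPU, CUDAService) when they are configured.
  std::vector<AcceleratorType> acceleratorsTypes_;
  std::vector<std::string> cpuModels_;
  std::vector<std::string> gpuModels_;
  std::string nvidiaDriverVersion_;
  std::string cudaDriverVersion_;
  std::string cudaRuntimeVersion_;
  std::string cpuModelsFormatted_;
  double cpuAverageSpeed_ = 0.;
};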

This PR does not include changes that let anything get information out of ResourceInformationService. Those changes will come in a future PR.

This also does not include changes to store this information persistently. That will also come in a future PR.

If the service has its verbose parameter set true, then it will print out some information at begin job.

PR validation:

There is a unit test that checks the information printed out when the service is set to be verbose. Nothing uses this service yet so I do not expect this will have any immediate effect on RelVals or production executables.

wddgit avatar May 05 '22 21:05 wddgit

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37831/29750

  • This PR adds an extra 36KB to repository

cmsbuild avatar May 05 '22 22:05 cmsbuild

A new Pull Request was created by @wddgit (W. David Dagenhart) for master.

It involves the following packages:

  • FWCore/Framework (core)
  • FWCore/Services (core)
  • FWCore/Utilities (core)
  • HeterogeneousCore/CUDAServices (heterogeneous)

@cmsbuild, @smuzaffar, @Dr15Jones, @makortel, @fwyzard can you please review it and eventually sign? Thanks. @makortel, @felicepantaleo, @rovere this is something you requested to watch as well. @perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

cmsbuild avatar May 05 '22 22:05 cmsbuild

please test

wddgit avatar May 05 '22 22:05 wddgit

@wddgit can we discuss the kind of information that should be gathered? I think it would be useful to have more details than just the number of accelerators and their models.

For example, some global information that would be useful to track (see the sketch after these lists):

  • the NVIDIA driver version being used (which depends on the local installation);
  • the CUDA driver version being used (which usually matches the NVIDIA driver version, but it could also be the compatibility library we ship with CMSSW);
  • the CUDA runtime version being used (which should be the version we ship with CMSSW, but better check).

Some additional per-GPU information that could be useful:

  • if the GPU usage is exclusive or shared with other jobs
  • the amount of total and free GPU memory when the job starts
  • etc.
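
For reference, here is a rough sketch (an illustration only, untested, error handling omitted) of how most of this could be queried through the CUDA runtime API. Note that the NVIDIA kernel driver version itself (e.g. 510.xx) is not available through the runtime API; it would need NVML (nvmlSystemGetDriverVersion) or /proc/driver/nvidia/version:

#include <cstdio>
#include <cuda_runtime.h>

int main() {
  int driverVersion = 0, runtimeVersion = 0;
  cudaDriverGetVersion(&driverVersion);    // CUDA driver API version (1000*major + 10*minor)
  cudaRuntimeGetVersion(&runtimeVersion);  // version of the CUDA runtime library in use
  std::printf("CUDA driver %d, runtime %d\n", driverVersion, runtimeVersion);

  int devices = 0;
  cudaGetDeviceCount(&devices);
  for (int i = 0; i < devices; ++i) {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, i);
    cudaSetDevice(i);
    size_t freeMem = 0, totalMem = 0;
    cudaMemGetInfo(&freeMem, &totalMem);  // free/total device memory at the time of the call
    std::printf("GPU %d: %s, %zu of %zu bytes free\n", i, prop.name, freeMem, totalMem);
  }
  return 0;
}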

fwyzard avatar May 05 '22 22:05 fwyzard

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-02f619/24488/summary.html
COMMIT: b4e04a925d636859f41df021f440746f4ee554a6
CMSSW: CMSSW_12_4_X_2022-05-05-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37831/24488/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

  • 136.803_RunNoBPTX2017C+RunNoBPTX2017C+HLTDR2_2017+RECODR2_2017reHLTAlCaTkCosmics_Prompt+HARVEST2017/step2_RunNoBPTX2017C+RunNoBPTX2017C+HLTDR2_2017+RECODR2_2017reHLTAlCaTkCosmics_Prompt+HARVEST2017.log
  • 136.8_RunSinglePh2017C+RunSinglePh2017C+HLTDR2_2017+RECODR2_2017reHLT_skimSinglePh_Prompt+HARVEST2017_skimSinglePh/step2_RunSinglePh2017C+RunSinglePh2017C+HLTDR2_2017+RECODR2_2017reHLT_skimSinglePh_Prompt+HARVEST2017_skimSinglePh.log
  • 136.802_RunMuOnia2017C+RunMuOnia2017C+HLTDR2_2017+RECODR2_2017reHLT_skimMuOnia_Prompt+HARVEST2017_skimMuOnia/step2_RunMuOnia2017C+RunMuOnia2017C+HLTDR2_2017+RECODR2_2017reHLT_skimMuOnia_Prompt+HARVEST2017_skimMuOnia.log
More relval errors:
  • 136.801
  • 136.7952
  • 136.804
  • 136.805
  • 136.806
  • 136.807
  • 136.808
  • 136.809
  • 136.81
  • 136.811
  • 136.812
  • 136.813
  • 136.814
  • 136.815
  • 136.816
  • 136.817
  • 136.818
  • 136.819
  • 136.82
  • 136.821
  • 136.822
  • 136.823
  • 136.824
  • 136.825
  • 136.826
  • 136.827
  • 136.828
  • 136.829
  • 136.83
  • 136.831
  • 136.8311
  • 136.83111
  • 136.832
  • 136.833
  • 136.834
  • 136.835
  • 136.836
  • 136.837
  • 136.838
  • 136.839
  • 136.8391
  • 136.84
  • 136.841
  • 136.842
  • 136.843
  • 136.844
  • 136.845
  • 136.846
  • 136.847
  • 136.848
  • 136.849
  • 136.85
  • 136.8501
  • 136.851
  • 136.852
  • 136.8521
  • 136.8522
  • 136.8523
  • 136.853
  • 136.854
  • 136.855
  • 136.856
  • 136.8561
  • 136.8562
  • 136.857
  • 136.858
  • 136.859
  • 136.86
  • 136.861
  • 136.862
  • 136.863
  • 136.864
  • 136.8642
  • 136.865
  • 136.866
  • 136.867
  • 136.868
  • 136.869
  • 136.87
  • 136.871
  • 136.872
  • 136.873
  • 136.874
  • 136.875
  • 136.876
  • 136.877
  • 136.878
  • 136.879
  • 136.88
  • 136.881
  • 136.882
  • 136.883
  • 136.884
  • 136.885
  • 136.8855
  • 136.885501
  • 136.885511
  • 136.885521
  • 136.886
  • 136.8861
  • 136.8862
  • 136.887
  • 136.888
  • 136.88811
  • 136.8885
  • 136.888501
  • 136.888511
  • 136.888521
  • 136.889
  • 136.89
  • 136.891
  • 136.892
  • 136.893
  • 136.894
  • 136.895
  • 136.896
  • 136.897
  • 136.898
  • 136.899
  • 137.8
  • 138.1
  • 138.2
  • 138.3
  • 138.4
  • 138.5
  • 139.001
  • 139.002
  • 139.003
  • 139.004
  • 139.005
  • 140.51
  • 140.52
  • 140.53
  • 140.54
  • 140.55
  • 140.56
  • 140.5611
  • 140.57
  • 158.01
  • 158.1
  • 158.2
  • 158.3
  • 1306.0
  • 1307.0
  • 1308.0
  • 1309.0
  • 1310.0
  • 1311.0
  • 1312.0
  • 1313.0
  • 1314.0
  • 1315.0
  • 1316.0
  • 1317.0
  • 1318.0
  • 1319.0
  • 1320.0
  • 1321.0
  • 1322.0
  • 1323.0
  • 1324.0
  • 1325.0
  • 1325.1
  • 1325.2
  • 1325.3
  • 1325.4
  • 1325.5
  • 1325.51
  • 1325.516
  • 1325.5161
  • 1325.517
  • 1325.518
  • 1325.6
  • 1325.61
  • 1325.7
  • 1325.8
  • 1325.81
  • 1325.9
  • 1325.91
  • 1326.0
  • 1327.0
  • 1328.0
  • 1329.0
  • 1329.1
  • 1330.0
  • 1331.0
  • 1332.0
  • 1333.0
  • 1334.0
  • 1335.0
  • 1336.0
  • 1337.0
  • 1338.0
  • 1339.0
  • 1340.0
  • 1341.0
  • 1343.0
  • 1344.0
  • 1345.0
  • 1347.0
  • 1348.0
  • 1349.0
  • 1350.0
  • 1351.0
  • 1352.0
  • 1353.0
  • 1354.0
  • 1355.0
  • 1364.0
  • 1365.0
  • 1366.0
  • 134.0
  • 134.99601
  • 134.99602
  • 134.99603
  • 134.99901
  • 144.6
  • 139901.0
  • 139902.0
  • 13992501.0
  • 13992502.0
  • 200.0
  • 202.0
  • 203.0
  • 205.0
  • 11024.2
  • 25200.0
  • 25202.0
  • 25202.1
  • 25202.2
  • 25203.0
  • 25204.0
  • 25205.0
  • 25206.0
  • 25207.0
  • 25208.0
  • 25209.0
  • 25214.0
  • 50200.0
  • 50202.0
  • 50203.0
  • 50204.0
  • 50205.0
  • 50206.0
  • 50207.0
  • 50208.0
  • 1000.0
  • 1001.0
  • 1001.2
  • 1002.0
  • 1003.0
  • 1004.0
  • 1010.0
  • 1020.0
  • 1030.0
  • 1040.0
  • 1040.1
  • 1041.0
  • 1042.0
  • 1102.0
  • 4000.0
  • 4001.0
  • 4002.0
  • 4003.0
  • 10001.0
  • 10002.0
  • 10003.0
  • 10004.0
  • 10005.0
  • 10006.0
  • 10007.0
  • 10008.0
  • 10009.0
  • 10023.0
  • 10024.0
  • 10024.1
  • 10024.2
  • 10024.3
  • 10024.4
  • 10024.5
  • 10025.0
  • 10026.0
  • 10042.0
  • 10059.0
  • 10071.0
  • 10224.0
  • 10225.0
  • 10424.0
  • 10801.0
  • 10802.0
  • 10803.0
  • 10804.0
  • 10804.31
  • 10805.0
  • 10805.31
  • 10806.0
  • 10807.0
  • 10808.0
  • 10809.0
  • 10823.0
  • 10824.0
  • 10824.1
  • 10824.5
  • 10824.501
  • 10824.505
  • 10824.511
  • 10824.521
  • 10824.6
  • 10824.8
  • 10825.0
  • 10826.0
  • 10842.0
  • 10842.501
  • 10842.505
  • 10859.0
  • 10871.0
  • 11024.0
  • 11024.6
  • 11025.0
  • 11224.0
  • 11224.6
  • 11601.0
  • 11602.0
  • 11603.0
  • 11604.0
  • 11605.0
  • 11606.0
  • 11607.0
  • 11608.0
  • 11609.0
  • 11630.0
  • 11634.0
  • 11634.1
  • 11634.24
  • 11634.5
  • 11634.501
  • 11634.505
  • 11634.511
  • 11634.521
  • 11634.601
  • 11634.7
  • 11634.91
  • 11640.0
  • 11643.0
  • 11646.0
  • 11650.0
  • 11650.501
  • 11650.505
  • 11723.17
  • 11725.0
  • 11834.0
  • 11834.13
  • 11834.21
  • 11834.24
  • 11834.99
  • 11846.0
  • 11925.0
  • 12034.0
  • 12434.0
  • 12634.0
  • 12634.99
  • 12834.0
  • 13034.0
  • 13034.99
  • 23234.0
  • 23234.21
  • 23434.21
  • 23434.99
  • 23434.9921
  • 23434.999
  • 34634.0
  • 35034.0
  • 39434.0
  • 39434.103
  • 39434.21
  • 39434.5
  • 39434.501
  • 39434.502
  • 39434.75
  • 39434.9
  • 39496.0
  • 39500.0
  • 39634.114
  • 39634.21
  • 39634.99
  • 39634.9921
  • 39634.999
  • 250200.0
  • 250200.17
  • 250200.18
  • 250202.0
  • 250202.1
  • 250202.17
  • 250202.171
  • 250202.172
  • 250202.18
  • 250202.181
  • 250202.2
  • 250202.3
  • 250202.4
  • 250202.5
  • 250203.0
  • 250203.17
  • 250203.18
  • 250204.0
  • 250204.17
  • 250204.18
  • 250205.0
  • 250205.17
  • 250205.18
  • 250206.0
  • 250206.17
  • 250206.18
  • 250206.181
  • 250207.0
  • 250207.17
  • 250207.18
  • 250208.17
  • 250208.18
  • 500200.0
  • 500202.0
  • 500203.0
  • 500204.0
  • 500205.0
  • 500206.0
  • 500207.0

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 8 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3700704
  • DQMHistoTests: Total failures: 19
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3700662
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 312.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 206 log files, 45 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

cmsbuild avatar May 06 '22 02:05 cmsbuild

Yes, this is definitely open for discussion. One reason I submitted this PR, which only partially resolves this issue, is that I wanted more discussion to make sure I was headed in the right direction before I put in more time. Matti is the expert on this and is directing my work. My experience with GPU issues is very small. I am just starting up that learning curve.

FYI. I have a week of vacation scheduled next week. Feel free to continue discussions in my absence and I'll continue working on this when I return.

wddgit avatar May 06 '22 14:05 wddgit

please test

The test errors look unrelated to this PR. Try running the tests again.

wddgit avatar May 06 '22 15:05 wddgit

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-02f619/24499/summary.html
COMMIT: b4e04a925d636859f41df021f440746f4ee554a6
CMSSW: CMSSW_12_4_X_2022-05-06-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37831/24499/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3700704
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3700680
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 206 log files, 45 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

cmsbuild avatar May 06 '22 19:05 cmsbuild

Thanks David.

I think it would be better to distinguish the CPU and the GPU models in this Service. I think all the considered consumers of this information (JobReport, CondorStatusService, "provenance") would want to report those separately (i.e. CPU model is this, and GPU model is that).

One feature in the consumers that I didn't consider before is that they all seem to need different levels of information. E.g. for the CPU:

  • for "provenance" in the ROOT file I'd imagine only the CPU model to be relevant
  • CondorStatusService reports "average speed" in addition to the model
  • JobReport adds even more information

One way would be to evolve the ResourceInformationService towards a key-value store, into which the CPU Service, CUDAService, etc. can push information, and the consumers would use what they consider relevant. That would require some level of standardization of the keys between the producers and consumers, at least for the consumers that want only a (small) subset of the information (e.g. JobReport could just dump everything).

Or maybe a 2-level hierarchical key-value store? E.g. expressed as JSON, something along the lines of

[
  {
    "Type" : "CPU",
    "Model" :  "Intel ...",
    ...
  },
  {
    "Type" : "GPU",
    "Model" : "NVIDIA ...",
    ...
  },
  {
    "Type" : "GPU",
    "Model" : "NVIDIA something else...",
    ...
  }
]

?
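
Purely for illustration (this is not an interface anyone has proposed in code), such a two-level store could be as simple as a vector of string-to-string maps that producer services append to and consumers filter; the key names here are made up:

#include <map>
#include <string>
#include <vector>

using ResourceRecord = std::map<std::string, std::string>;  // one resource, described by key/value pairs
using ResourceStore = std::vector<ResourceRecord>;          // one record per resource

int main() {
  ResourceStore store;
  // A CPU-type service could push one record:
  store.push_back({{"Type", "CPU"}, {"Model", "Intel ..."}, {"AverageSpeedMHz", "2600"}});
  // A GPU-type service could push one record per device:
  store.push_back({{"Type", "GPU"}, {"Model", "NVIDIA ..."}});
  store.push_back({{"Type", "GPU"}, {"Model", "NVIDIA something else..."}});

  // A consumer interested only in GPU models filters on the agreed keys:
  for (auto const& record : store) {
    auto type = record.find("Type");
    auto model = record.find("Model");
    if (type != record.end() && type->second == "GPU" && model != record.end()) {
      // report model->second, e.g. to the job report
    }
  }
  return 0;
}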

@fwyzard We can certainly add more information to be passed around, but we should also have some understanding of where that information would be consumed. E.g. for the consumers above, I'd suggest as an overall guideline:

  • file "provenance" would be limited to information that can affect physics results
  • CondorStatusService would be limited to information useful for monitoring currently/recently running grid jobs
  • Framework job report could contain almost anything (as it is quite large already)

Do you have any other consumer in mind for this kind of information?

makortel avatar May 07 '22 01:05 makortel

Hi Matti, the various driver versions can affect the physics results, due to bug fixes within them, and the possible use of runtime version checks to enable or disable optimisations. (e.g.: the 510.xx driver series fixes a bug in cooperative groups)

So, this could be similar to the impact of the glibc version (do we store and consume that anywhere?).

Information about available memory and exclusive use should not affect the physics output, but could be useful for debugging problems based on the reports.

Other details like core counts, total memory, clock speed, etc. could be useful mostly for monitoring, and maybe for scaling the reconstruction time.

fwyzard avatar May 07 '22 07:05 fwyzard

Thanks @fwyzard.

So, this could be similar to the impact of the glibc version (do we store and consume that anywhere?).

I don't think we store the glibc version explicitly anywhere, but to a large degree that is governed by the production SCRAM_ARCH of a given release (and if one uses a non-production arch, one is expected to know what one is doing).

the various driver versions can affect the physics results, due to bug fixes within them, and the possible use of runtime version checks to enable or disable optimisations.

Including the driver version in the "file provenance" makes sense (it can be expected to vary a lot more than e.g. the actual glibc binary). Maybe the driver version could be generic enough between vendors that we could call the field just "GPU driver version" without explicitly specifying CUDA/ROCm/etc., since the vendor should be clear from the model name record.

Information about available memory and exclusive use should not affect the physics output, but could be useful for debugging problems based on the reports.

Would the GPU model name be sufficient to infer the available memory, at least for the "file provenance"?

How would we know if a process has exclusive use of a GPU? (without constantly/periodically monitoring possible other processes accessing the GPU, in which case we would know it too late for the "file provenance")

makortel avatar May 13 '22 19:05 makortel

Would the GPU model name be sufficient to infer the available memory, at least for the "file provenance"?

I'm not 100% sure: there are some GPU models that come in different variants with different amounts of memory, and I don't know if that is always part of the "model name".

Moreover, the free memory is likely more important than the total memory anyway.

How would we know if a process has exclusive use of a GPU? (without constantly/periodically monitoring possible other processes accessing the GPU, in which case we would know it too late for the "file provenance")

Whether the GPU is in "exclusive mode" or "shared mode" is reported in the computeMode field of the cudaDeviceProp structure returned by cudaGetDeviceProperties(...).

If the GPU is in exclusive mode we know no other process can use it.

If the GPU is in shared mode we don't know if any other process is using it or not (without "looking" with something like nvidia-smi or an equivalent API, I guess).
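
For concreteness, a minimal (untested) sketch of that check; cudaComputeModeExclusive is the older thread-exclusive mode and cudaComputeModeExclusiveProcess the per-process one:

#include <cstdio>
#include <cuda_runtime.h>

int main() {
  int devices = 0;
  cudaGetDeviceCount(&devices);
  for (int i = 0; i < devices; ++i) {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, i);
    // computeMode tells whether other processes may use the device concurrently
    // (cudaComputeModeProhibited, not handled here, would mean no compute contexts at all).
    bool exclusive = (prop.computeMode == cudaComputeModeExclusiveProcess ||
                      prop.computeMode == cudaComputeModeExclusive);
    std::printf("GPU %d (%s): %s\n", i, prop.name,
                exclusive ? "exclusive mode" : "shared (default) mode");
  }
  return 0;
}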

fwyzard avatar May 13 '22 21:05 fwyzard

I was looking at this again this afternoon. There is one general question that bothers me a little. There is some discussion that "things that affect physics" need to be stored. My understanding is that our goal is that none of these things affect physics, beyond maybe some precision and rounding issues, which might very rarely introduce a significant difference when some quantity lies very near the boundary of a cutoff. None of these things should affect physics unless there is a bug or unexpected problem somewhere.... Is what we really mean "things likely to have problems that would affect physics"?

One other comment. If we include things like free memory available in what is persistently stored, then every object will be a little different. Our initial idea that we might have many objects exactly the same when a large number of files were merged would no longer be valid. It makes the idea of using the ParameterSetID registry to store these things less attractive ... Just thinking out loud about one of the possibilities...

wddgit avatar May 19 '22 21:05 wddgit

There is some discussion that "things that affect physics" need to be stored. My understanding is that our goal is that none of these things affect physics, beyond maybe some precision and rounding issues, which might very rarely introduce a significant difference when some quantity lies very near the boundary of a cutoff. None of these things should affect physics unless there is a bug or unexpected problem somewhere.... Is what we really mean "things likely to have problems that would affect physics"?

I think you're on the right track. Ideally none of these should affect physics. But we know that (already in the absence of bugs) with some algorithms e.g. SSE/AVX vs. AVX2 can make a numerical difference (because of FMA), and CPU vs GPU can make a numerical difference. So one part is to give that information for the case one is wondering e.g. why the same event in different Primary Datasets was not reconstructed bitwise identically (which has happened).

The other part is then showing information for investigating actual problems. In principle the (machine) code for different vectorization levels, and in particular between CPU and GPU, is different, and it would be good to record that. (This applies also to the GPU driver version, especially if we start to run different versions of algorithms based on the driver version, see https://github.com/cms-sw/cmssw/pull/35713.)
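
As a toy illustration of the FMA point: a fused multiply-add rounds once while the separate multiply and add round twice, so the two results can differ in the last bit. Whether the compiler contracts a*b + c into an FMA depends on flags such as -ffp-contract, so the non-fused line below only sketches the non-FMA behaviour:

#include <cmath>
#include <cstdio>

int main() {
  // Values chosen so that the product a*b carries a low-order bit that is
  // lost when the product is rounded to double before the addition,
  // but is kept when the whole a*b + c is rounded only once (FMA).
  double a = 1.0 + 0x1p-27;
  double b = 1.0 + 0x1p-27;
  double c = -1.0;
  double separate = a * b + c;       // two roundings (unless the compiler contracts it to an FMA)
  double fused = std::fma(a, b, c);  // single rounding
  std::printf("separate = %.20g\nfused    = %.20g\n", separate, fused);
  return 0;
}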

If we include things like free memory available in what is persistently stored, then every object will be a little different. Our initial idea that we might have many objects exactly the same when a large number of files were merged would no longer be valid.

Thanks for pointing out a concern on the storage. I think concerns of (available) memory should be monitored in some other way.

Anyway, I think we should start with something reasonable, as we can always adjust the set of information when we gain more experience.

makortel avatar May 20 '22 00:05 makortel

If the goal is for the information collected by the ResourceInformationService to be easily mergeable across different jobs running in similar environments, maybe more extended information could be stored in a (per-job?) data product?

fwyzard avatar May 21 '22 06:05 fwyzard

If the goal is for the information collected by the ResourceInformationService to be easily mergeable across different jobs running in similar environments, maybe more extended information could be stored in a (per-job?) data product?

The goal of the ResourceInformationService is to unify how the information on compute resources is passed around to various monitoring consumers (like the framework job report, CondorStatusService, the upcoming "accelerator provenance").

This goal is broader than the "accelerator provenance" that would be stored and managed by the framework. The ResourceInformationService itself can (and likely will) be made to handle more information than what the "accelerator provenance" needs.

Focusing then on the "accelerator provenance", we actually wanted to use the ProcessBlock (which is the closest thing to a per-job data product) to store the information. I believe this approach would have worked fine for offline use, where each job processes full LuminosityBlocks. But we were concerned about the cost of doing that at the HLT, where a (large) number N of jobs process the data for one LuminosityBlock, followed by merges (in addition to having to extend the streamer file format). Specifically, because ProcessBlock does not support merging products (for reasons), we would have had to store N copies/versions of the products, with indexing to map Events to the products (ProcessBlocks). Instead, we wanted to minimize the overhead of the information the framework itself would store.

makortel avatar May 23 '22 20:05 makortel

But we were concerned about the cost of doing that at the HLT, where a (large) number N of jobs process the data for one LuminosityBlock, followed by merges (in addition to having to extend the streamer file format). Specifically, because ProcessBlock does not support merging products (for reasons), we would have had to store N copies/versions of the products, with indexing to map Events to the products (ProcessBlocks).

Without entering into the details of why ProcessBlock does not support merging products in general, would it maybe be possible to support "merging" in the special case of streamer files with identical products?

fwyzard avatar May 23 '22 21:05 fwyzard

(I thought I had sent this kind of reply already, but apparently it got lost somewhere)

Without entering into the details of why ProcessBlock does not support merging products in general, would it maybe be possible to support "merging" in the special case of streamer files with identical products?

Thanks for the reminder that the streamer files are merged outside of cmsRun. That might make it feasible to use ProcessBlock, and we'll take another look at that. Out of curiosity, are you thinking of this "accelerator provenance" specifically, or of storing other data that is also constant throughout the HLT farm (like in your https://github.com/cms-sw/cmssw/issues/30044#issuecomment-1018730886)?

makortel avatar May 31 '22 18:05 makortel

Is there a .cc file that contains main() where I can start to look at this other executable that merges streamer files? Is there any documentation anywhere related to it that I can read?

wddgit avatar May 31 '22 19:05 wddgit

As far as I know, streamer files are simply concatenated with cat (or something equivalent), following an initialisation file (which should itself be identical for all the files produced by the HLT job for a given run). @smorovic can give you more details.

fwyzard avatar May 31 '22 19:05 fwyzard

For example, an HLT job may produce

run350955/run350955_ls0000_streamPhysics_pid1341573.ini
run350955/run350955_ls1081_streamPhysics_pid1341573.dat
run350955/run350955_ls1081_streamPhysics_pid1341573.jsn
run350955/run350955_ls1082_streamPhysics_pid1341573.dat
run350955/run350955_ls1082_streamPhysics_pid1341573.jsn
...

The first one is the initialisation file, which should be identical for all HLT jobs for this run.

The *.dat files are the streamer files.

The *.jsn files contain the metadata for the *.dat files: how many input events have been processed, how many events were selected and stored in the corresponding .dat file, etc..

As far as I know, the merging steps will hierarchically merge the streamer files produced by the many HLT jobs for a given lumisection, and append the result to a copy of the initialisation file. The result is read by the repacking job and converted to EDM .root format.

fwyzard avatar May 31 '22 19:05 fwyzard

Hi, there isn't much (or any) documentation for this format, as far as I know.

@fwyzard already explained the concept and how they are merged. Trivial concatenation is the main reason we use them. This is the source .cc file where both the INI files (registry serialization) and each event are serialized and written out into a buffer (before being written into the INI or DAT file): https://github.com/cms-sw/cmssw/blob/master/IOPool/Streamer/src/StreamSerializer.cc

Note that we take only one copy of the INI file (they are identical across processes). The merging step above is then done for a separate set of DAT files created in each lumisection.

Possibly the ProcessBlock could be added once per lumisection by each process to each file (assuming it's small)? Then the source in the repacking step would, I guess, be responsible for deduplication (assuming, maybe naively, that they shouldn't be repeated in each LS and that the process already caches them for output to ROOT).

The JSON files aren't related to the CMSSW/ROOT content; they are only used for event accounting in DAQ.

smorovic avatar May 31 '22 20:05 smorovic

I spent yesterday afternoon and this morning looking at this PR and I am not sure how to proceed with this. But here are some comments related to the streamer format based on comments received here and in issue 30044:

In HLT, several processes run on different machines (which could have different hardware, chips). Each produces a streamer format without a file header, just events, and different events are processed on different machines. Later, in a separate process, these streamer files get merged by simple concatenation and a single header is added. There is currently no provision for complex logic here to do things other than simple concatenation of events and appending that to a header. Unless we dramatically change how this works, the only way to store information about the machine that processes an individual event is in the event. I see no way to store this in ProcessBlock. The only place to put the information is in the events. Unless there is special code to merge the event information, I cannot even see how to create a ProcessBlock out of the information. And this special code would have to run as part of the code that writes the file header... Not sure how to do that. This is far outside the context of the current ProcessBlock code we have now. I'm not saying it cannot be done or that I cannot do it, but it is outside of the code I usually work on, outside of my expertise, and outside of the code Core usually controls... I suppose the process that creates the file header could read the events, merge the relevant information in some way, and create a block in the header. Or the process that converts from streamer format to edm format could do the same thing and then really create a ProcessBlock object (although I think currently this is a simple merge process)...

There are comments in issue #30044 back in January. From Emilio, "Then the only thing that makes sense is to have this information as an event" https://github.com/cms-sw/cmssw/issues/30044#issuecomment-1017344193

Also from issue #30044 back in January from Andrea: "Yes, the idea would be to have this effectively per-event, so we can check offline if there are problems related to accelerator-specific implementations." https://github.com/cms-sw/cmssw/issues/30044#issuecomment-1017481902

So I am lost in trying to understand how to make a Streamer ProcessBlock for this and what information we would put in there.

My first thought is that I should put this PR on hold, focus on finishing the unrelated run concurrency PR I have been working on while Matti is away, and attack this problem again in July. The content of that PR is complex, so every time I switch back and forth between it and other things I spend a lot of time remembering what I was doing and getting my head back into the complexities of those code changes.

wddgit avatar Jun 01 '22 17:06 wddgit

The discussion was complicated by us diverging into the file storage, of which this PR is completely independent. I suggest we move further discussion on that (or on the information to actually store) back to #30044 (@wddgit, could you repost your comment there?).

makortel avatar Jun 01 '22 18:06 makortel

Sure. That is probably a better place to post this. I'll repost it.

wddgit avatar Jun 01 '22 18:06 wddgit

@wddgit and I had a chat and decided to proceed as follows with this PR

  • Include only the information intended for CondorStatusService and the "architecture provenance" in ResourceInformationService (for now at least)
    • JobReport includes a lot more information, and currently the CPU information is added at endJob, while CondorStatusService and the "architecture provenance" would use the information at the beginning of the job. Storing all the information in memory throughout the job sounded like something we preferred to avoid.
    • JobReport and CPU service already have an acceptable dependence relation (CPU service uses JobReport, same model scales to arbitrary number of architecture-specific services)
  • Do not generalize ResourceInformationService to a key-value store, but have the information keys explicitly spelled out in method names (for now at least)
    • The amount of stored information is expected to stay small, and this way we can use the compiler to check the consistency of the "keys"
  • Include (for now) the NVIDIA/CUDA driver and CUDA runtime versions @fwyzard suggested in https://github.com/cms-sw/cmssw/pull/37831#issuecomment-1119095848. More information can be added later.

makortel avatar Jun 02 '22 00:06 makortel

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37831/30392

  • This PR adds an extra 44KB to repository

  • There are other open Pull requests which might conflict with changes you have proposed:

    • File HeterogeneousCore/CUDAServices/src/CUDAService.cc modified in PR(s): #37952

cmsbuild avatar Jun 03 '22 22:06 cmsbuild

Pull request #37831 was updated. @cmsbuild, @smuzaffar, @Dr15Jones, @makortel, @fwyzard can you please check and sign again.

cmsbuild avatar Jun 03 '22 22:06 cmsbuild

please test

wddgit avatar Jun 03 '22 22:06 wddgit

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-02f619/25267/summary.html
COMMIT: 3c78a7b5c99f09f959ab378890d63758ceceb84a
CMSSW: CMSSW_12_5_X_2022-06-03-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37831/25267/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test TestFWCoreFrameworkDeleteEarly had ERRORS
---> test testFWCoreFrameworkModuleDeletion had ERRORS

Comparison Summary

There are some workflows for which there are errors in the baseline: 250400.18 step 1. The results for the comparisons for these workflows could be incomplete. This most likely means that the IB is having errors in the relvals. The error does NOT come from this pull request.

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3651513
  • DQMHistoTests: Total failures: 8
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3651483
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

cmsbuild avatar Jun 04 '22 03:06 cmsbuild