data-curation
CMS - provenance for mcdb cases
Observed when running code/lhe_generators.py with the updates.
xz: File too large
It writes some LOG.txt files under a dataset-name path (instead of the usual recid) in the lhe_generators/2016-sim/gridpacks/ directory:
$ ls lhe_generators/2016-sim/gridpacks/ | tail -15
75600
75601
BcToBuKPi_BuJPsiK_TuneCP5_13TeV-bcvegpy2-pythia8-evtgen
BcToJPsiMuMu_inclusive_TuneCP5_13TeV-bcvegpy2-pythia8-evtgen
BcToJpsPi_TuneCP5_13TeV-bcvegpy2-pythia8-evtgen
BcToPsi2SPi_PJPP_TuneCP5_13TeV-bcvegpy2-pythia8-evtgen
BcToPsi2SPi_PMM_TuneCP5_13TeV-bcvegpy2-pythia8-evtgen
SPS_D0ToKPi_JPsiPt-100To150_TuneCP5_13TeV-helaconia-pythia8-evtgen
SPS_ToY1SZ_Y1SToMuMu_ZToMuMu_TuneCP5_13TeV-helaconia-pythia8
ST_t-channel_eDecays_anomwtbLVRT_RT4_TuneCP5_13TeV-comphep-pythia8
ST_t-channel_tauDecays_anomwtbLVLT_LT_TuneCP5_13TeV-comphep-pythia8
X0ToUpsilonJPsi_M-12p6_TuneCP5_v2_13TeV-JHUGen-pythia8
X0ToUpsilonJPsi_M-12p7_TuneCP5_v2_13TeV-JHUGen-pythia8
X0ToUpsilonJPsi_M-12p9_TuneCP5_v2_13TeV-JHUGen-pythia8
X0ToUpsilonJPsi_M-13p4_TuneCP5_v2_13TeV-JHUGen-pythia8
with this type of output:
$ head lhe_generators/2016-sim/gridpacks/BcToBuKPi_BuJPsiK_TuneCP5_13TeV-bcvegpy2-pythia8-evtgen/RunIISummer20UL16NanoAODv9-106X_mcRun2_asymptotic_v17-v1/NANOAODSIM/LOG.txt
2024-06-02 00:14:03 | ERROR | Error xz: (stdout): Write error: File too large
xz: (stdout): Write error: File too large
xz: (stdout): Write error: File too large
xz: (stdout): Write error: File too large
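To list every affected directory rather than only the tail of the listing, something along these lines might help. It is a minimal sketch that assumes recid directories always have purely numeric names and that the error text matches the string shown above. The helper name find_non_recid_logs is hypothetical and is not part of code/lhe_generators.py.

```python
from pathlib import Path


def find_non_recid_logs(gridpacks_dir, needle="File too large"):
    """Return LOG.txt paths under non-numeric top-level dirs that contain `needle`.

    Assumption: a directory named with digits only (e.g. 75600) is a recid
    directory and is expected; anything else is a dataset-name path.
    """
    root = Path(gridpacks_dir)
    hits = []
    if not root.is_dir():
        return hits
    for entry in sorted(root.iterdir()):
        # Skip plain files and the usual recid directories.
        if not entry.is_dir() or entry.name.isdigit():
            continue
        # The LOG.txt sits a few levels down (<campaign>/<tier>/LOG.txt).
        for log in sorted(entry.rglob("LOG.txt")):
            if needle in log.read_text(errors="replace"):
                hits.append(log)
    return hits


if __name__ == "__main__":
    for path in find_non_recid_logs("lhe_generators/2016-sim/gridpacks"):
        print(path)
```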
LPAIR generator
"/GGToMuMu_Pt-25_Inel-El_13TeV-lpair/RunIISummer20UL16NanoAODv9-106X_mcRun2_asymptotic_v17-v1/NANOAODSIM": 36304, "mcdb_id": 19544
lhe_generators/2016-sim/mcdb/19544_header.txt has
$ head lhe_generators/2016-sim/mcdb/19544_header.txt
<header>
This file was created from the output of the LPAIR generator
</header>
<header>
This file was created from the output of the LPAIR generator
</header>
Init block with no information on the generator
fpmc
"/GGToGG_bSM_A1A_1e-13_A2A_1e-13_Pt-50_13TeV_fpmc/RunIISummer20UL16NanoAODv9-106X_mcRun2_asymptotic_v17-v1/NANOAODSIM": 36290, "mcdb_id": 19101
"/GGToGG_bSM_A1A_1e-14_A2A_1e-14_Pt-50_13TeV_fpmc/RunIISummer20UL16NanoAODv9-106X_mcRun2_asymptotic_v17-v1/NANOAODSIM": 36292, "mcdb_id": 19102
"/GGToGG_bSM_A1A_5e-13_A2A_0_Pt-50_13TeV_fpmc/RunIISummer20UL16NanoAODv9-106X_mcRun2_asymptotic_v17-v1/NANOAODSIM": 36294, "mcdb_id": 19103
"/GGToGG_SM_Pt-50_13TeV_fpmc/RunIISummer20UL16NanoAODv9-106X_mcRun2_asymptotic_v17-v1/NANOAODSIM": 36296, "mcdb_id": 19104
e.g.
$ cat lhe_generators/2016-sim/mcdb/19103_header.txt
<init>
2212 2212 0.65000000E+04 0.65000000E+04 -1 -1 -1 -1 4 1
0.49029854E-01 0.23246991-306 0.49029854E-04 -1
</init>
lpair
"/GGToMuMu_Pt-25_Inel-Inel_13TeV-lpair/RunIISummer20UL16NanoAODv9-106X_mcRun2_asymptotic_v17-v1/NANOAODSIM": 36306, "mcdb_id": 19545
with multiple init blocks
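The header cases above (a header block only, an init block only, multiple init blocks) could be flagged automatically with a check like the following. This is only a sketch: it counts opening tags in the text of a *_header.txt file and does not attempt to parse the LHE format. The function name summarize_lhe_header and the returned keys are hypothetical.

```python
import re


def summarize_lhe_header(text):
    """Count <header> and <init> opening tags in the text of a header file."""
    n_header = len(re.findall(r"<header\b", text))
    n_init = len(re.findall(r"<init\b", text))
    return {
        "header_blocks": n_header,
        "init_blocks": n_init,
        # Only an init block: no free-text information on the generator.
        "init_only": n_header == 0 and n_init > 0,
        "multiple_init": n_init > 1,
    }


if __name__ == "__main__":
    import sys
    from pathlib import Path

    # Usage: python check_headers.py lhe_generators/2016-sim/mcdb/*_header.txt
    for name in sys.argv[1:]:
        summary = summarize_lhe_header(Path(name).read_text(errors="replace"))
        print(name, summary)
```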
Full event file
The file is named <mcdb_id> instead of <mcdb_id>_header.txt and apparently holds the full event content:
$ ls -lhS lhe_generators/2016-sim/mcdb
total 49G
-rw-r--r--. 1 kati zh 47G Jun 3 19:05 19658
-rw-r--r--. 1 kati zh 1.1G Jun 2 16:05 19405
-rw-r--r--. 1 kati zh 980M Jun 2 16:04 19412
These are for
33273: /BcToPsi2SPi_PMM_TuneCP5_13TeV-bcvegpy2-pythia8-evtgen/RunIISummer20UL16NanoAODv9-106X_mcRun2_asymptotic_v17-v2/NANOAODSIM
19405: /ST_t-channel_tauDecays_anomwtbLVLT_LT_TuneCP5_13TeV-comphep-pythia8/RunIISummer20UL16NanoAODv9-106X_mcRun2_asymptotic_v17-v1/NANOAODSIM
19412: /ST_t-channel_eDecays_anomwtbLVRT_RT4_TuneCP5_13TeV-comphep-pythia8/RunIISummer20UL16NanoAODv9-106X_mcRun2_asymptotic_v17-v1/NANOAODSIM
The first is also for several other datasets.
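Files like these can be spotted without reading the ls output by hand. The sketch below assumes that every legitimate file in the mcdb directory ends in _header.txt, so anything else is treated as a suspected full event file. The helper name find_full_event_files is hypothetical.

```python
from pathlib import Path


def find_full_event_files(mcdb_dir, suffix="_header.txt"):
    """Return (name, size_in_bytes) for files not ending in `suffix`, largest first."""
    root = Path(mcdb_dir)
    if not root.is_dir():
        return []
    found = [
        (p.name, p.stat().st_size)
        for p in root.iterdir()
        if p.is_file() and not p.name.endswith(suffix)
    ]
    # Sort by size, descending, to mirror `ls -lhS`.
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, size in find_full_event_files("lhe_generators/2016-sim/mcdb"):
        print(f"{name}\t{size / 1e9:.2f} GB")
```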