Create record for derived data for 2011 jet validation step

Open katilp opened this issue 9 years ago • 13 comments

Files from /afs/cern.ch/user/m/mhaapale/work/public/Jet-Tuples-Summer2016 need to be uploaded to a record (similar to http://opendata.cern.ch/record/230). @tamshai can provide the metadata details. The record will refer to the new sw records (in #1191).

katilp avatar Oct 07 '16 14:10 katilp

@tiborsimko Could you kindly move these files to eospublic so that Freya can check the code: /afs/cern.ch/user/m/mhaapale/work/public/Jet-Tuples-Summer2016

The parent dataset for data is /Jet/Run2011A-12Oct2013-v1/AOD. For MC, the parent datasets are as indicated in https://github.com/cms-opendata-validation/2011-jet-inclusivecrosssection-ntupleproduction/blob/master/tuple_info_mc

katilp avatar Apr 23 '18 19:04 katilp

So for data, they could be under https://eospublichttp01.cern.ch/eos/opendata/cms/Run2011A/Jet. What kind of naming did we have for the derived datasets? I see that the event display files are under IG, and in https://eospublichttp01.cern.ch/eos/opendata/cms/Run2011A/DoubleElectron we have PATtuples, so their directory could be called jettuples (output of http://opendata.cern.ch/record/5104).

katilp avatar Apr 23 '18 19:04 katilp

@katilp I get permission denied:

$ ls -l /afs/cern.ch/user/m/mhaapale/work/public/Jet-Tuples-Summer2016/
ls: cannot access /afs/cern.ch/user/m/mhaapale/work/public/Jet-Tuples-Summer2016/: Permission denied

Could be due to CMS-only access, I guess.

Could you move the files to some fully public AFS space?

> we have PATtuples, so their directory could be call jettuples

PAT refers to the Physics Analysis Toolkit. Could it perhaps be used here too? (I haven't seen what the files are...)

tiborsimko avatar Apr 24 '18 12:04 tiborsimko
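The `ls` failure above can also be reproduced from a script before attempting a copy. This is only a minimal sketch: `check_directory` is a hypothetical helper, not part of any CERN tooling. It actually tries to list the directory rather than inspecting mode bits, on the assumption that AFS access is governed by ACLs and the POSIX mode bits may not tell the whole story.

```python
import os


def check_directory(path):
    """Try to list a directory and classify the outcome as a short string."""
    try:
        entries = os.listdir(path)
    except PermissionError:
        # Same condition that `ls` reports as "Permission denied".
        return "permission denied"
    except FileNotFoundError:
        return "missing"
    except NotADirectoryError:
        return "not a directory"
    return f"readable ({len(entries)} entries)"


if __name__ == "__main__":
    # Path taken from the thread; it is only reachable on a machine with AFS.
    print(check_directory(
        "/afs/cern.ch/user/m/mhaapale/work/public/Jet-Tuples-Summer2016/"
    ))
```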

@tiborsimko OK, it is the same for me. I'm following up with AFS support and have added you in cc; it should still be possible to copy them. For the naming, PATtuples is no good here as it is a specific, well-defined format, while these are plain ROOT tuples containing jets. But it is the same format as the one used in the CMS jet group, so I would go for jettuples.

katilp avatar Apr 24 '18 12:04 katilp

@tiborsimko The files can now be uploaded (https://cern.service-now.com/service-portal/view-request.do?n=RQF1000758)

katilp avatar May 09 '18 09:05 katilp

@katilp Thanks, I have the files:

8302268834 Jet-Tuples-Summer2016/data/OpenDataTuple-Data-Jet-Run2011A-npv.root
8281446820 Jet-Tuples-Summer2016/data/OpenDataTuple-Data-Jet-Run2011A.root
2491180282 Jet-Tuples-Summer2016/MC/OpenDataTuple-MC-QCD_Pt-15to1000_TuneZ2_7TeV_pythia6.root

For Data, we discussed storing them under /eos/opendata/cms/Run2011A/Jet/jettuples.

For MC, since there are multiple parents, where do we store them? E.g. (1) we could create a new directory /eos/opendata/cms/MonteCarlo2011/Summer11LegDR/QCD_Pt-15to1000_TuneZ2_7TeV_pythia6 and store them under jettuples there. Or (2) we could simply store both Data and MC in the same place. It perhaps depends on how many bibliographic records we create for the jet tuples...

tiborsimko avatar May 18 '18 10:05 tiborsimko
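For a rough idea of the storage involved, the size listing above can be summed with a few lines of Python. This is just a sketch: the listing is copied verbatim from the comment, and the helper names `parse_listing` and `total_bytes` are made up for illustration.

```python
# Size listing copied from the comment above: "<bytes> <relative path>".
LISTING = """\
8302268834 Jet-Tuples-Summer2016/data/OpenDataTuple-Data-Jet-Run2011A-npv.root
8281446820 Jet-Tuples-Summer2016/data/OpenDataTuple-Data-Jet-Run2011A.root
2491180282 Jet-Tuples-Summer2016/MC/OpenDataTuple-MC-QCD_Pt-15to1000_TuneZ2_7TeV_pythia6.root
"""


def parse_listing(text):
    """Return a list of (path, size_in_bytes) tuples from the listing."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        size, path = line.split(maxsplit=1)
        entries.append((path, int(size)))
    return entries


def total_bytes(entries):
    """Sum the sizes of all parsed entries."""
    return sum(size for _, size in entries)


if __name__ == "__main__":
    entries = parse_listing(LISTING)
    for path, size in entries:
        print(f"{size / 1e9:8.2f} GB  {path}")
    print(f"{total_bytes(entries) / 1e9:8.2f} GB  total")
```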

@tiborsimko Thanks! It would make sense to create a new directory for the MC as you suggest, i.e. option (1). This is somewhat different from the Higgs to 4l ROOT files record 5501 (ROOT files from data and MC in the same record), as that record contains ROOT files with histograms only, while these are ROOT files with event-by-event information.

katilp avatar May 18 '18 10:05 katilp

Done, everything is copied:

  • /eos/opendata/cms/Run2011A/Jet/jettuples/OpenDataTuple-Data-Jet-Run2011A-npv.root
  • /eos/opendata/cms/Run2011A/Jet/jettuples/OpenDataTuple-Data-Jet-Run2011A.root
  • /eos/opendata/cms/MonteCarlo2011/Summer11LegDR/QCD_Pt-15to1000_TuneZ2_7TeV_pythia6/jettuples/OpenDataTuple-MC-QCD_Pt-15to1000_TuneZ2_7TeV_pythia6.root

Please let me know if you'd like to change the names in some way. (currently "OpenDataTuple-...")

We can proceed with the record creation. CC @ArtemisLav

tiborsimko avatar May 18 '18 14:05 tiborsimko
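The layout used for the copied files follows one pattern: collection, then parent dataset, then a `jettuples` subdirectory. A small sketch of that convention, inferred only from the three paths listed above (the function name `derived_file_path` and its parameters are hypothetical):

```python
from pathlib import PurePosixPath

# Base directory seen in the paths above.
EOS_ROOT = PurePosixPath("/eos/opendata/cms")


def derived_file_path(collection, dataset, filename, subdir="jettuples"):
    """Build the EOS path of a derived file stored next to its parent dataset.

    `collection` may itself contain slashes, e.g. "MonteCarlo2011/Summer11LegDR".
    """
    return str(EOS_ROOT / collection / dataset / subdir / filename)


if __name__ == "__main__":
    print(derived_file_path(
        "Run2011A", "Jet", "OpenDataTuple-Data-Jet-Run2011A.root"
    ))
    print(derived_file_path(
        "MonteCarlo2011/Summer11LegDR",
        "QCD_Pt-15to1000_TuneZ2_7TeV_pythia6",
        "OpenDataTuple-MC-QCD_Pt-15to1000_TuneZ2_7TeV_pythia6.root",
    ))
```

Keeping the subdirectory name as a parameter leaves room for other derived formats (such as the existing PATtuples directories) to follow the same scheme.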

@katilp could you please provide the metadata for this record?

ArtemisLav avatar May 23 '18 13:05 ArtemisLav

@ArtemisLav we will still need to check that they correspond to the code record on the portal with which they can be produced. The metadata is available in

  • https://github.com/cms-opendata-validation/2011-jet-inclusivecrosssection-ntupleproduction/blob/master/tuple_info_data for data
  • https://github.com/cms-opendata-validation/2011-jet-inclusivecrosssection-ntupleproduction/blob/master/tuple_info_mc for MC

katilp avatar May 23 '18 19:05 katilp

@katilp The metadata for the data record only mentions one file with 25607902 events, but in EOS we have 2 (OpenDataTuple-Data-Jet-Run2011A.root & OpenDataTuple-Data-Jet-Run2011A-npv.root). Should both go on this record?

Also, we are missing authors for the metadata. Is it the collaboration or specific people?

You can check out the commit below for what I have so far.

ArtemisLav avatar May 24 '18 10:05 ArtemisLav

@ArtemisLav Thanks. I noticed that you haven't opened the PR yet; the commit is only in your personal repository. Could you please make a PR? I'll amend it as appropriate.

Note to self:

  • 10.7483/OPENDATA.CMS.IIIF.M653
  • 10.7483/OPENDATA.CMS.WE44.2399

tiborsimko avatar Jul 15 '19 10:07 tiborsimko

@tiborsimko PR done. It is a pretty old commit, so it will most likely require a lot of editing.

ArtemisLav avatar Aug 01 '19 10:08 ArtemisLav