madanalysis5
madanalysis5 copied to clipboard
Jet Substructure
Context
This branch is dedicated to implementing substructure tools to the MadAnalysis framework.
Description of the change and benefits
Interface updates
- MultiJet clustering :
madanalysis/jet_clustering/.
jet_algorithm
: This module is separate from the original jet clustering interface within Ma5.
This module can be activated using the following command
ma5> define jet_algorithm my_jet antikt radius=0.5
where my_jet
is a user-defined jet identifier, antikt
is the algorithm to be used which
can be chosen from antikt
, cambridge
, genkt
, kt
, gridjet
, cdfjetclu
, cdfmidpoint
,
and siscone
. The rest of the arguments are optional if the user won't define the radius, ptmin etc.
default parameters will be chosen. Each algorithm has its own unique set of parameters i.e.
Algorithm | Parameters & Default values |
---|---|
antikt , cambridge |
radius=0.4 , ptmin=5. |
genkt |
radius=0.4 , ptmin=5. , exclusive=False , p=-1 |
kt |
radius=0.4 , ptmin=5. , exclusive=False |
gridjet |
ymax=3. , ptmin=5. |
cdfjetclu |
radius=0.4 , ptmin=5. , overlap=0.5 , seed=1. , iratch=0. |
cdfmidpoint |
radius=0.4 , ptmin=5. , overlap=0.5 , seed=1. , iratch=0. , areafraction=1. |
siscone |
radius=0.4 , ptmin=5. , overlap=0.5 , input_ptmin=5. , npassmax=1. |
VariableR |
rho=2000. , minR=0. , maxR=2. , ptmin=20. exclusive=False clustertype=CALIKE strategy=Best |
It is also possible to modify the entry after defining it
ma5> define jet_algorithm my_jet cambridge
ma5> set my_jet.ptmin = 200.
ma5> set my_jet.radius = 0.8
Note that when a jet_algorithm
is defined MadAnalysis interface will automatically switch to constituent smearing mode. set my_jet.+<tab>
will show the dedicated options available for that particular algorithm.
It is possible to display all the jets available in the current session by using display jet_algorithm
command:
$ ./bin/ma5 -R
ma5>set main.fastsim.package = fastjet
ma5>define jet_algorithm my_jet cdfmidpoint
ma5>display jet_algorithm
MA5: * Primary Jet Definition :
MA5: fast-simulation package : fastjet
MA5: clustering algorithm : antikt
MA5: + Jet ID : Ma5Jet
MA5: + cone radius = 0.4
MA5: + PT min (GeV) for produced jets = 5.0
MA5: + exclusive identification = true
MA5: + b-jet identification:
MA5: + DeltaR matching = 0.5
MA5: + exclusive algo = true
MA5: + id efficiency = 1.0
MA5: + mis-id efficiency (c-quark) = 0.0
MA5: + mis-id efficiency (light quarks) = 0.0
MA5: + hadronic-tau identification:
MA5: + id efficiency = 1.0
MA5: + mis-id efficiency (light quarks) = 0.0
MA5: --------------------
MA5: * Other Jet Definitions:
MA5: 1. Jet ID = my_jet
MA5: - algorithm : cdfmidpoint
MA5: - radius : 0.4
MA5: - ptmin : 5.0
MA5: - overlap : 0.5
MA5: - seed : 1.0
MA5: - iratch : 0.0
MA5: - areafraction : 1.0
Here primary jet is defined with the original jet definition syntax of MadAnalysis 5 where since we did not specify anything, it uses default antikt
configuration. For more info on how to define primary jet see arXiv:2006.09387. Other jet definitions show all the jets which are defined via jet_algorithm
keyword.
To remove a jet_algorithm
definition one can use remove my_jet
command. Note that one can also change the name of the primary jet which is Ma5Jet
by default.
ma5>set main.fastsim.JetID = my_primary_jet
Link to SFS: There can only be one jet smearing/tagging definition where in case of existing multi-jet definitions smearing will be applied in constituent level which is used by all jets defined in the framework. Jet tagging is only available for the primary jet.
Updates in expert mode structure
Expert mode is designed to automatically realize and construct the MadAnalysis framework according to the given .ma5
command script. This command script can include all the SFS construction mentioned in arXiv:2006.09387 and it can also include multijet definitions
sfs_card_cms_exo_xx_yy.ma5
:
set main.fastsim.package = fastjet
# Define multijet
define jet_algorithm AK08 antikt
set AK08.radius = 0.8
set AK08.ptmin = 200
define jet_algorithm CA15 cambridge radius=1.5 ptmin=200.0
# Define Jet reconstruction efficiencies
define reco_efficiency j 0.925 [abseta <= 1.5]
define reco_efficiency j 0.875 [abseta > 1.5 and abseta <= 2.5]
define reco_efficiency j 0.80 [abseta > 2.5]
# Define Jet smearer
define smearer j with PT sqrt(0.06^2 + pt^2*1.3e-3^2) [abseta <= 0.5 and pt > 0.1]
define smearer j with PT sqrt(0.10^2 + pt^2*1.7e-3^2) [abseta > 0.5 and abseta <= 1.5 and pt > 0.1]
define smearer j with PT sqrt(0.25^2 + pt^2*3.1e-3^2) [abseta > 1.5 and abseta <= 2.5 and pt > 0.1]
# Define B-tagging
define tagger j as b 0.01+0.000038*pt
define tagger c as b 0.25*tanh(0.018*pt)/(1.0+ 0.0013*pt) [abseta < 2.5]
define tagger c as b 0.0 [abseta >=2.5]
define tagger b as b 0.85*tanh(0.0025*pt)*(25.0/(1+0.063*pt)) [abseta < 2.5]
define tagger b as b 0.0 [abseta >= 2.5]
sfs_card_cms_exo_xx_yy.ma5
shows a simple example of multi-jet clustering and detector simulation implementation. First, it chooses the FastJet package as fastsim
interpreter, then defines multijet and in the following defines a simple detector simulation. This file can be executed as follows;
$ ./bin/ma5 -Re cms_exo_xx_yy cms_exo_xx_yy sfs_card_cms_exo_xx_yy.ma5
here cms_exo_xx_yy
is a given analysis and sfs_card_cms_exo_xx_yy.ma5
holds the information for the detector simulation (PAD requires sfs_card_cms_exo_xx_yy.ma5
to setup detector simulation for the analysis code, it automatically writes the detector simulation according to the given sfs_card
file. Note that if there is a card with the same name for multiple analysis files, those analyses can be executed at the same time, hence allowing more efficient execution.). This command will write a folder called cms_exo_xx_yy
including all MadAnalysis 5 framework within.
- Multijet definitions: Multijets are automatically defined in
cms_exo_xx_yy/Build/Main/main.cpp
(do not change) as follows;
//Adding new jet with ID AK08
std::map<std::string, std::string> JetConfiguration1;
JetConfiguration1["JetID" ] = "AK08";
JetConfiguration1["algorithm" ] = "antikt";
JetConfiguration1["cluster.R" ] = "0.8";
JetConfiguration1["cluster.PTmin" ] = "200.0";
cluster1->LoadJetConfiguration(JetConfiguration1);
//Adding new jet with ID CA15
std::map<std::string, std::string> JetConfiguration2;
JetConfiguration2["JetID" ] = "CA15";
JetConfiguration2["algorithm" ] = "cambridge";
JetConfiguration2["cluster.R" ] = "1.5";
JetConfiguration2["cluster.PTmin" ] = "200.0";
cluster1->LoadJetConfiguration(JetConfiguration2);
all these inputs are interpreted by JetClusterer
machinery within MadAnalysis 5. Additionally Makefile
has been modified via CXXFLAGS += -DMA5_FASTJET_MODE
flag to enable FastJet use within MadAnalysis data structure, without this flag fastjet dependent accessors will not work.
- Analysis folder:
cms_exo_xx_yy/Build/SampleAnalyzer/User/Analyzer
This folder includes all the definitions written for the detector simulation;
$ ls
analysisList.h cms_exo_xx_yy.h new_smearer_reco.cpp new_tagger.cpp reco.h
cms_exo_xx_yy.cpp efficiencies.h new_smearer_reco.h new_tagger.h sigmas.h
cms_exo_xx_yy.*
are the analysis files and the rest are detector simulation modules (Please do not change those files when the analysis submitted in PAD only cms_exo_xx_yy.*
, cms_exo_xx_yy.info
and sfs_card_cms_exo_xx_yy.ma5
files are allowed. If you need to modify detector simulation, please modify sfs_card_cms_exo_xx_yy.ma5
and re-execute the workspace or include your personal modifications in cms_exo_xx_yy.cpp
).
- Multijet accessor: Each multijet is accessible within
c++
interface through their unique identifiers. Primary jet is a special case where one can useevent.rec()->jet()
accessor to see all primary jets. Rest of the jets can be found usingevent.rec()->jet("AK08")
orevent.rec()->jet("CA15")
respectively.
Update on expert mode compilation
In order to separate the execution method for Delphes and SFS-FastJet based analyses, a command line option has been added to the setup.sh
. In order to enable SFS-FastJet mode use following commands;
source setup.sh --with-fastjet
make clean all
or simply type source setup.sh -h
for details.
New shortcuts
RecJet := MA5::RecJetFormat *;
RecJets := std::vector<const Jet>;
RecTau := MA5::RecTauFormat *;
RecTaus := std::vector<const Tau>;
RecLepton := MA5::RecLeptonFormat *;
RecLeptons := std::vector<const Lepton>;
RecPhoton := MA5::RecPhotonFormat *;
RecPhotons := std::vector<const Photon>;
FastJet Contrib toolset
SoftDrop
MAfloat32 z_cut = 0.10;
MAfloat32 beta = 2.0;
Substructure::SoftDrop softDrop(beta, z_cut);
RecJets filtered_jets = filter(event.rec()->jets(), 20., 2.5);
// Only leading jet:
if (filtered_jets.size() > 0)
RecJet softdrop_jet = softDrop.Execute(Jets[0]);
// All jets:
RecJets softdrop_jets = softDrop.Execute(Jets);
Cluster
Execute with reconstructed event sample
Substructure::Cluster cluster(Substructure::antikt, 0.4, 20);
cluster.Execute(event, "AK4");
if (event.rec()->jets("AK4").size() > 0)
INFO << "AK4 Jet PT = " << event.rec()->jets("AK4")[0].pt() << endmsg;
Other execution modes:
MAfloat32 R = 0.5;
MAfloat32 pitmin = 20.; // optional
Substructure::Cluster cluster(Substructure::cambridge, R, ptmin);
// algorithm options antikt, cambridge, kt
RecJets Jets = filter(event.rec()->jets(), 20., 2.5);
// Only leading jet:
if (Jets.size() > 0)
RecJets jet = cluster.Execute(Jets[0]);
// All jets:
RecJets jets = cluster.Execute(Jets);
Filter clustered object
Substructure::Cluster cluster(Substructure::cambridge, 0.2);
RecJets Jets = filter(event.rec()->jets("AK8"), 20., 2.5);
if (Jets.size() > 0) {
RecJets FilteredSubjets = cluster.Execute(
Jets[0], [](const RecJet jet, const RecJet subjet)
{ return subject->pt() > jet->pt() * 0.05 ; }
);
for (auto &jet: FilteredSubjets) INFO << Jets[0]->pt()*0.05 << " < " << jet->pt() << endmsg;
}
Recluster
RecJets Jets = filter(event.rec()->jets(), 20., 2.5);
Substructure::Recluster recluster(Substructure::cambridge, 0.5);
// Only leading jet:
if (Jets.size() > 0)
RecJet jet = recluster.Execute(Jets[0]); // only gives the leading jet
// All jets:
RecJets jets = recluster.Execute(Jets); // only gives the leading jet for each reclustered jet
Nsubjettiness
RecJets Jets = filter(event.rec()->jets(), 20., 2.5);
Substructure::Nsubjettiness nsubjettiness(
1, // order
Substructure::Nsubjettiness::KT_Axes, // axes definition
Substructure::Nsubjettiness::UnnormalizedMeasure, // measure definition
1., // beta
-1., // R0
std::numeric_limits<double>::max() // Rcutoff
);
if (Jets.size() > 0)
MAdouble64 tau1 = nsubjettiness.Execute(Jets[0]);
Available axes definitions:
KT_Axes,
CA_Axes,
AntiKT_Axes,
WTA_KT_Axes,
WTA_CA_Axes,
Manual_Axes,
OnePass_KT_Axes,
OnePass_CA_Axes,
OnePass_AntiKT_Axes,
OnePass_WTA_KT_Axes,
OnePass_WTA_CA_Axes,
Available measure definitions
NormalizedMeasure, // (beta,R0)
UnnormalizedMeasure, // (beta)
NormalizedCutoffMeasure, // (beta,R0,Rcutoff)
UnnormalizedCutoffMeasure, // (beta,Rcutoff)
VariableR Plugin
Initialize through normal mode
define jet_algorithm my_varR VariableR rho=2000 minR=0 maxR=2 ptmin=15 exclusive=False clustertype=CALIKE strategy=Best
Initialisation through analysis:
MAfloat32 rho = 2000.;
MAfloat32 minR = 0.;
MAfloat32 maxR = 2.;
Substructure::ClusterType clusterType = Substructure::VariableR::AKTLIKE; // CALIKE, KTLIKE, AKTLIKE
Substructure::Strategy strategy = Substructure::VariableR::Best; // Best, N2Tiled, N2Plain, NNH, Native
MAfloat32 ptmin = 0.;
MAbool isExclusive = false;
Substructure::VariableR variableR(
rho, minR, maxR, clusterType, strategy, ptmin, isExclusive
);
OR
Substructure::VariableR variableR;
variableR.Initialize(rho, minR, maxR, clusterType, strategy, ptmin, isExclusive);
Execution for the reco event:
std::string JetID = "VarR"
variableR.Execute(event, JetID);
if (event.rec()->jets("VarR").size() > 0)
INFO << "VarR Jet PT = " << event.rec()->jets("VarR")[0].pt() << endmsg;
Execution with reconstructed jets:
RecJets Jets = filter(event.rec()->jets(), 20.);
RecJets VarRJets = variableR.Execute(Jets);
if (VarRJets.size() > 0)
INFO << "VarR Jet PT = " << VarRJets[0]->pt() << endmsg;
Execution with a single jet:
RecJets Jets = filter(event.rec()->jets(), 20.);
RecJets VarRJets = variableR.Execute(Jets[0]);
if (VarRJets.size() > 0)
INFO << "VarR Jet PT = " << VarRJets[0]->pt() << endmsg;
Filter clustered object
RecJets Jets = filter(event.rec()->jets("AK8"), 20., 2.5);
if (Jets.size() > 0) {
RecJets FilteredSubjets = variableR.Execute(
Jets[0], [](const RecJet jet, const RecJet subjet){ return subject->pt() > jet->pt() * 0.05 ; }
);
for (auto &jet: FilteredSubjets) INFO << Jets[0]->pt()*0.05 << " < " << jet->pt() << endmsg;
}
Pruner
execute with a single jet
Substructure::Pruner pruner(Substructure::antikt, 0.1, 1.);
vector<const RecJetFormat*> Jets = filter(event.rec()->jets(), 20.);
const RecJetFormat* PrunedJet = pruner.Execute(Jets[0]);
INFO << "prunned Jet PT = " << PrunedJet->pt() << endmsg;
vector execution
Substructure::Pruner pruner(Substructure::antikt, 0.1, 1.);
vector<const RecJetFormat*> Jets = filter(event.rec()->jets(), 20.);
vector<const RecJetFormat*> PrunedJets = pruner.Execute(Jets);
for (auto &jet: PrunedJets)
INFO << "prunned Jet PT = " << jet->pt() << endmsg;
Jet Filtering
std::vector<const RecJetFormat*> Jets = filter(event.rec()->jets(), 20., 2.5);
Substructure::Selector selector = Substructure::SelectorPtFractionMin(0.03);
MAfloat32 Rfilt = 0.2;
Substructure::Filter jetFilter(Rfilt, selector);
// Execute for a single jet
if (Jets.size() > 0) {
const RecJetFormat *filtjet = jetFilter.Execute(Jets[0]); // only gives the leading jet
INFO << filtjet->pt() << " " << Jets[0]->pt() << endmsg;
}
// execute as a vector
std::vector<const RecJetFormat *> filtjets = jetFilter.Execute(Jets);
Energy Correlator
std::vector<const RecJetFormat*> Jets = filter(event.rec()->jets(), 20., 2.5);
MAuint32 N = 1;
MAfloat32 beta = 0.1;
EnergyCorrelator::Measure measure = EnergyCorrelator::Measure::pt_R;
EnergyCorrelator::Strategy strategy = EnergyCorrelator::Strategy::storage_array;
Substructure::EnergyCorrelator EC(N,beta,measure,strategy);
INFO << EC.Execute(Jets[0]) << endmsg;
Tests were done for backwards compatibility
Not tested.
TODO
- [ ] Remove accesibility of FastJet through analysis.
- [x] Add fj-contrib wrappers.
Drawbacks: Drawbacks have been elevated by splitting delphes and sfs based execution modes. This hasn't been extensively tested yet but should solve the problem.