Slow MG5 v261 source with $gran = 1
Dear all,
I am opening this issue for colleagues who are working on MG5 v2.6.1 (mg261).
For DY+0,1,2,3,4 jets (and probably any process with a large number of subprocesses/integration channels), the MG5 v2.6.1 source itself has a problem with $gran = 1.
I reported this on Launchpad: https://answers.launchpad.net/mg5amcnlo/+question/676453
The MG authors will investigate the problem. I guess it is related to the move from gen_ximprove.f to gen_ximprove.py and to memory usage.
The symptom is that generating 500 events takes ~14,000 sec in v2.6.1 with gran = 1. (In v2.6.0 it takes 417 sec.)
(Here is a summary of this test: http://147.47.242.72/USER/jhchoi/generator_group/slow_mg261/slow_dy_4j_mg261_181129.pdf)
(1) I have a question about the concept of "granularity".
If you look at the MG5 v2.6.1 source from the link below:
https://cms-project-generators.web.cern.ch/cms-project-generators/MG5_aMC_v2.6.1.tar.gz
and open madgraph/madevent/gen_ximprove.py,
at line 1637 it selects which channels to generate events for.
For a given channel, if (expected # of events) < (random number * granularity), then no events are generated for that channel.
I guess this "granularity" is meant to reduce runtime by skipping rare channels whose expected number of events is too small.
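For reference, here is a minimal Python sketch of that check (my paraphrase of the logic, not the actual gen_ximprove.py source; the function and argument names are mine):

```python
import random

def keep_channel(expected_events, granularity):
    """Decide whether a channel gets any events at all.

    The channel is kept with probability expected_events / granularity
    (capped at 1), rather than being cut deterministically whenever
    expected_events < granularity.
    """
    return expected_events >= random.random() * granularity
```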
The random number is necessary when event generation is split across many grid jobs. For example, suppose we generate 1,000,000 events with 200 jobs, i.e. 5,000 events per job. From this reference (https://cp3.irmp.ucl.ac.be/projects/madgraph/wiki/IntroGrid), the recommended granularity is sqrt(# of events) = 1,000, so channels whose expected number of events is above 1,000 are guaranteed to produce events. If we set gran = 71 ~ sqrt(5000) for each job, a channel whose expected number of events is 50 produces no events in that job. But after combining all 200 jobs, the expected number of events for that channel is 50*200 = 10,000, which is too large to ignore. Without the random number the channel would be rejected deterministically in every job and end up with 0 events, even though its total expectation of 10,000 is well above 1,000. With the random number it survives each job with probability 50/71, so on average the contribution is recovered.
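To check my understanding, here is a small simulation of that scenario (my own illustration, not MG5 code; I assume, following the IntroGrid wiki's description, that a channel which passes the cut is asked for one granularity's worth of events):

```python
import random

N_JOBS = 200
EXPECTED = 50  # expected events for the channel, per job
GRAN = 71      # ~ sqrt(5000 events per job)

# With the random threshold the channel survives a job with
# probability EXPECTED/GRAN and is then asked for GRAN events,
# so the per-job expectation (EXPECTED/GRAN)*GRAN = EXPECTED is preserved.
with_random = sum(
    GRAN if EXPECTED >= random.random() * GRAN else 0
    for _ in range(N_JOBS)
)

# Without the random factor, the fixed cut EXPECTED < GRAN
# rejects the channel in every single job.
without_random = sum(
    GRAN if EXPECTED >= GRAN else 0
    for _ in range(N_JOBS)
)

print(with_random)     # ~10,000 = 50 * 200 on average
print(without_random)  # always 0
```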
Do I understand this concept correctly?
(2) The author said CMS wants to set the granularity to "1", which means we do not reject any channel even if its expected number of events is very small. But this could be inefficient as well. Do you know why CMS sticks to fixing it at "one"?
Best regards,
Junho
It might be related to https://answers.launchpad.net/mg5amcnlo/+question/664751: "...For small sample generated, it is likely that many channels are discarded (which is by design) and I do not know if those channel are included or not in the total cross-section. If they are not, then indeed this can increase the fluctuation of the reported cross-section out of the gridpack (even if the average should still be ok actually)"
@kdlong may be able to comment more.