Review Overlay producer
Is your feature request related to a problem? Please describe.
In the Overlay producer we pick events based on a random number (uniform distribution)
int start_event = rndm_->Uniform(20., 4570);
// EventFile::skipToEvent handles actual number of events in file
int evNb = overlayFile_->skipToEvent(start_event);
but the min and max are hardcoded, so this limits which events in the PU library get used.
Describe the solution you'd like
Either make this range configurable, or somehow deduce the number of events in the PU library automatically.
Additional context
Came up in https://github.com/LDMX-Software/ldmx-sw/pull/1622#discussion_r1965701764
so i will give you some context for the number chosen here.
the main use for large numbers of pileup events is in central production. the number of events in a file produced centrally is somewhere between 5k (lower case for ecal PN) and 10k (inclusive). to match this, i use 10k events in all centrally produced pileup event files. the wrapping mechanism makes sure we can still use an event number that is outside the number of events in the "sim" file.
the total pileup batch is 1e8 events as per an old discussion. that means some reuse of pileup events, but it was agreed that it is unlikely that the features of the pileup itself will be decisive in any event selection, as the sim event it is mixed with will always differ (and the likelihood of keeping the same pileup after trigger is probably small -- and in any event this can be checked).
thus, i don't think it needs to be larger. but making it smaller unnecessarily increases the risk of reusing events. the number of events used in the CI should not be a driving factor in this logic.
i guess you could set it larger (1e5?) if you want to make really sure you don't always miss the last events in a large pileup file. EventFile::skipToEvent() still takes the random number modulo the number of actual entries in the file.
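The wrapping mentioned above can be sketched roughly as follows. This is a minimal illustration, not the actual ldmx-sw code: `wrapToFile` is a hypothetical helper, and the real `EventFile::skipToEvent()` may handle offsets and edge cases differently.

```cpp
#include <cassert>

// Hypothetical stand-in for the wrapping behaviour described in the thread.
// Maps a requested (randomly drawn) event number onto a file that actually
// holds n_entries events by taking the remainder.
long wrapToFile(long requested, long n_entries) {
  assert(n_entries > 0);   // an empty file cannot be wrapped into
  assert(requested >= 0);  // drawn event numbers are non-negative
  return requested % n_entries;
}
```

Under this assumption, a draw of 4570 against a 1000-entry file lands on entry 570, so a too-large upper bound is harmless. A too-small upper bound is not: entries beyond it in a larger file would never be selected as a starting point.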
> the number of events used in the CI should not be a driving factor in this logic.
Yeah I agree, my point was not to hardcode any number, but have it configurable. For example if we decide to have bigger than 10k events in the future.
So I'd suggest making it a variable and setting the default to 10k for now. Does that sound reasonable?
i guess i just fail to see the point/need to tailor this to every individual use case. if you feel strongly about it, go ahead.
@bryngemark can you help me to understand the choice for the lower end, why is it 20 and not just 0 (or 1)?
this was pretty random. i think i started out not knowing whether the first event was 0 or 1 and then just decided to overshoot it a bit. i'd say pick 1 🙂
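For reference, the configurable range discussed in this thread could look something like the sketch below. All names here (`StartEventRange`, `drawStartEvent`) are hypothetical, and the sketch uses the standard library's `std::mt19937` instead of the producer's `rndm_` member, so it only shows the shape of the change. The defaults of 1 and 10000 are the values suggested above.

```cpp
#include <cassert>
#include <random>

// Hypothetical parameter holder; in the producer these values would
// presumably come from the configuration rather than being hardcoded.
struct StartEventRange {
  int min = 1;      // lower end suggested in the thread
  int max = 10000;  // default upper end suggested in the thread
};

// Draw a start event uniformly from [range.min, range.max], both inclusive.
// Note: this differs slightly from truncating a floating-point Uniform(a, b)
// draw to int, where the upper bound itself is effectively excluded.
int drawStartEvent(std::mt19937& rng, const StartEventRange& range) {
  assert(range.min <= range.max);
  std::uniform_int_distribution<int> dist(range.min, range.max);
  return dist(rng);
}
```

The drawn value would then be passed to `skipToEvent` as before, with the wrapping taking care of files that hold fewer events than the upper bound.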