ldmx-sw Trigger Scintillator reco: redo reconstruction pipeline following ECal scheme

How...?

Apr 17 '20 15:04 awhitbeck

You might want to talk with @nhanvtran about how the digi pipeline works because the HCal and the trigger scintillator have similar methods of detection and may be similar in some respects.

The following information is about the DAQ pipeline, the trigger pipeline will be different.

My Knowledge about the ECal Scheme

The basic flow of the reconstruction pipeline is

(sim hits) ->Digitizer-> (digi hits) ->Reconstructor-> (rec hits)

where we make the following definitions.

Hit Type	Definition
Sim	Output by Geant4 simulation in Sensitive Detectors. Mimics actual physical energy deposits.
Digi	C++ object that stores what comes off the detector (e.g. ADC and detector IDs). Incorporates a noise simulation and any readout effects.
Rec	Uses the digitized hits (and knowledge of how the readout chips work) to extract physical variables like energy and spatial position. Maybe tries to suppress any noise.

The reason I think we should break it up like this is it makes our simulation pipeline more closely resemble what our data pipeline will be:

(raw binary out from detector) ->Unpacker-> (digi hits) ->Reconstructor-> (rec hits)

Here is my presentation from the last ldmx-sw workshop where I have some helpful diagrams about the digi pipeline. I am talking specifically about the ECal, but you can get the idea.

The first draft of your reco pipeline can be just splitting the producer you have now into two different producers and defining an event bus object that hands the information from one to the other.

Apr 17 '20 16:04 tomeichlersmith

sure, I can follow up with @nhanvtran. One question about the workflow, though. How should we work in gen-level information?
for example, I want to propagate information from simhits about the provenance of energy deposited in each cell. If I follow the pipeline strictly, I will lose this information at the Digi step. Should we just build customized objects that allow for gen-truth level information to be matched to reco level information? I wonder if other subsystems have thought about this...

Apr 17 '20 17:04 awhitbeck

I think persisting the gen-level information through the digi step is dangeous because it might lead algorithms to use gen-level information (and be "too" good at their job).

One way to obtain the gen level information is to compare sections of the subsystem using the detector IDs. The sim calorimter hits store all of their contributing particles (Contribs) so you can see exactly what caused the hit in your detector and you can compare to the same detector id at the rec level.

I did this in the EcalDigiVerify analyzer in the Ecal module. It compares the simulated energy deposited and the reconstructed energy deposited for each cell with a reconstructed hit in it. It searches the simulated hits for the matching cell by looking for the detector ID.

Apr 17 '20 18:04 tomeichlersmith

thanks for the reference. I would like something that can be saved in the event. I think we can probably get away with having a Digi collection with no gen-level information and a simple gen-hit (not sure of the vernacular) collection where sim-hits are summed for all cells plus other information can be stored about the progenitors of the energy deposited. This way, Det-IDs can be matched up, as you say. The details will depend on the use cases, but already I would like to isolate energy deposits from beam electrons and secondaries. It will likely also be useful to know which beam electron hit which cells.

Apr 28 '20 20:04 awhitbeck

okay, so first thing first: I am starting to think about the Digi data structure. See branch iss709.

I tried to reimplement what @tomeichlersmith had for EcalDigiCollection. However, this inherits from DigiCollection, which has a fixed word size for samples. For TriggerScintillator the natural unit is either 14 or 16 bits per channel. Do we care about not using 16/32 bits? I guess yes.

One solution is that we just pack the data so that every word is 2 time-samples. This is not so bad since the charge will be spread over roughly 2 time-samples. Then we can configure the Digi collection as long as we save an even number of samples. thought? It's not elegant... but could work. Also, probably the struct is not as user friendly as it could be currently.

Apr 28 '20 23:04 awhitbeck

@awhitbeck You bring up some good points about DigiCollection. Currently DigiCollection is too restrictive. Instead DigiCollection should be an interface (or abstract class if some methods need to be defined).

@tomeichlersmith Can you update Digi collection so it's more flexible?

For example, you can template addDigi and the vector of samples can use std::variant instead. Also, getChannelID should be simply getID because some subsystems require more than a channel as an ID. You can also simply make getID pure virtual. Also, getNumDigis and getSampleWord should be dropped in favor of a method that returns all digis. Finally, make sure to make all public methods are virtual so they case be overridden in the derived class.

Apr 29 '20 00:04 omar-moreno

I absolutely can update the digi collection so it's more flexible.

My worry:

I'm not sure if we can save things to the event bus by using inheritance through std::variant, if that doesn't work, it would be better for us to ditch the parent digi collection and just have all subsystems have their own digi collection. I'll open up an issue to keep track of this.

Apr 29 '20 13:04 tomeichlersmith

Talking through it with Andrew a bit, it's probably OK if you make it a short.

On Wed, Apr 29, 2020 at 6:47 AM Tom Eichlersmith [email protected] wrote:

I absolutely can update the digi collection so it's more flexible.

My worry:

I'm not sure if we can save things to the event bus by using inheritance through std::variant, if that doesn't work, it would be better for us to ditch the parent digi collection and just have all subsystems have their own digi collection. I'll open up an issue to keep track of this.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/LDMX-Software/ldmx-sw/issues/709#issuecomment-621219198, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA4JMXEWQ7N7ED75Z7YOFJLRPAVV3ANCNFSM4MK3C6FA .

Apr 29 '20 16:04 omar-moreno

One more thought, just for the record. It was agreed that it made sense to make the Digi code such that all arrays were in one collection. So, we need to tackle issue 559 first. In any case, I'll keep working to get this setup for a single array.

May 05 '20 18:05 awhitbeck

Reigniting this issue, @GNiramay is working on the code to do this. It is in iss709v2.

Currently, in iss709v2L there is a new producer exits: TrigScintQIEDigiProducer that simulates the pulse shape of hits, digitizes pulses into a collection of 5 time-samples (the number is configurable). Digi information that corresponds to ADC, TDC, and capID (all information that each QIE provides) is saved. Stores a collection of TrigScintQIEDigis in event.

The original TrigScintDigiProducer is still there and still takes sim hits as an input. I suggest that we keep this as a parallel path for now and we continue to build a new RECO pipeline off of the TrigScintDigis collection. @GNiramay will do some cleanup to the code, but it would be great if we can push a PR through to get this into master before other major changes happen. This will help us to study detailed information related to timing, pileup, etc.

Oct 06 '20 13:10 awhitbeck

hey @awhitbeck what's the status of this development? I see the iss709 branch exists https://github.com/LDMX-Software/ldmx-sw/compare/trunk...iss709 but iss709v2 does not. Are the devs in iss709 branch still useful for the future developments?

Sep 18 '24 15:09 tvami