AliceO2 icon indicating copy to clipboard operation
AliceO2 copied to clipboard

[ITS2-18] ITS-MFT clusterer: implementation of digit squashing on neighbour ROFs

Open mconcas opened this issue 1 year ago • 22 comments

Hi @shahor02, as discussed last week i fellback on the original approach. This PR is to show some progress and discuss the details. Here is a possible implementation, my idea was that operating at digit level would have helped for MC implementation too, not 100% sure if correct assumption.

I have some points that are worth to mention:

  • I opted for an approach of finding neighbours and masking them for next ChipPixelData creation in subsequent ROF. This can be discussed, I am not sure it's perfectly fine (maybe for monte carlo labels?).
  • I tested only with and without passing the parameter --squash-overflow-pixels and with the attached reproducing command. I expect to have more to adapt.
  • Probably the loop over the digits of a specific ChipID can be optimized, storing some index to be started with next iteration (like you do in the filling loop for chips).

I attach plots for current state results if they make any sense: Before: no masking and no squashing Screenshot from 2022-08-28 14-42-38 After: unsing squashing of depth 2 ROFs (e.g. into ROF N we merge info from N+1 and N+2) Screenshot from 2022-08-28 17-59-11

I run on one TF from run LHC22f_520143, (3b) specifically: /alice/data/2022/LHC22f/520143/raw/1650/o2_rawtf_run00520143_tf00000002_epn224.tf with this command:

o2-raw-tf-reader-workflow --input-data raw/raw-list --onlyDet ITS --delay 2 --shm-segment-size 24000000000 |  \
o2-itsmft-stf-decoder-workflow -b --squash-overflow-pixels --configKeyValues "ITSClustererParam.maxBCDiffToMaskBias=-10" | \
o2-its-cluster-writer-workflow --disable-mc -b

Hope this goes in the right direction. Cheers, Matteo

mconcas avatar Aug 28 '22 13:08 mconcas

Thanks for the prompt look to this!

  1. the squashed digits should have the same sorting (in column, then in row) as original ones, this is required by the clusterer algo. Not sure in the current implementation this is guaranteed.

Ok, this I somehow expected. The overflow pixels are indeed just "pushed_back" into the chip digits vector. Would a 2-D sorting (row-col) done at the end of squashing work?

  1. Since July we are storing CTF with clusters made of unmasked digits. The processing of this data assumes that we unroll clusters to digits in the ITS/MFT entropy decoder...

In principle this should work with any digit reader that is required to squash the chip data, so I assume this would work also for this case.

Concerning your plot with results: in principle, I would expect that squashing does not increase the number of clusters found in the ROF which was the destination of squashing, since the squashed digits by construction are neighbours of digits in the ROF.

Yes, this I was also expecting, that's why I was not so convinced.

mconcas avatar Aug 28 '22 18:08 mconcas

Ok, this I somehow expected. The overflow pixels are indeed just "pushed_back" into the chip digits vector. Would a 2-D sorting (row-col) done at the end of squashing work?

yes

In principle this should work with any digit reader that is required to squash the chip data, so I assume this would work also for this case.

agree, but one should pass corresponding options

shahor02 avatar Aug 28 '22 18:08 shahor02

Hi @shahor02, thanks, indeed the sorting made tha trick. Screen Shot 2022-08-29 at 8 56 09 AM

mconcas avatar Aug 29 '22 06:08 mconcas

Hi @shahor02 I added the CLI parameter to the ITS reco workflow and tested with one ctf of run LHC22f_522711 Here some comparisons: Before: no masking, no squashing 1_ctf_no_squashing

After: no masking, yes squashing: Screen Shot 2022-08-29 at 1 54 32 PM

And also something from the tracker: Blue: before, Red: after

Eta: photo_2022-08-29 13 58 44

Phi: photo_2022-08-29 13 59 21

Z seeding vertex: Screen Shot 2022-08-29 at 1 50 21 PM

Unfortunately for vertex the statistics is not high enough to see the cusps to effectively fade away.

Also I see a too high degradation of the time elapsed in the two cases. This I did not see with previous run used for tests, think it's mainly because the former was Single_3b_2_2_2 while the latter is 25ns_1935b_1922_1602_1672_192bpi_14inj_3INDIVs, both at 202 KHz.

Degradation as expected scales with the number of digits per ROF.

mconcas avatar Aug 29 '22 12:08 mconcas

Thanks, the ratio of tracks is reasonable but the eta shape is not as flat is it was in Oct.2021 test. For the clusters the effect is not evident from the plot with dense bunch filling. Perhaps it is worth to pring the digits before and after the squashing for some well-populated chip in IB to check by eye if it behaves as expected. The squashing of MC labels is not covered yet, right?

What is the timing benchmark change?

shahor02 avatar Aug 29 '22 12:08 shahor02

Thanks, the ratio of tracks is reasonable but the eta shape is not as flat is it was in Oct.2021 test. For the clusters the effect is not evident from the plot with dense bunch filling. Perhaps it is worth to pring the digits before and after the squashing for some well-populated chip in IB to check by eye if it behaves as expected.

Yes, I will try to do that.

The squashing of MC labels is not covered yet, right?

No. It's the next in the line.

What is the timing benchmark change?

I did not implemented a dedicated benchmark yet, to give a bit of the order of magnitude. Before was:

real	1m13.028s
user	1m36.525s
sys	0m16.267s

after is:

real	5m29.984s
user	6m23.664s
sys	0m15.945s

Factor ~4.5 😅

Nonetheless, I think there are some things that can be improved, such as iterations over all the digits in ROF for every chip, and maybe the management of the vector of flags to keep used digits. I have no idea what will be the impact tho. Potentially one culd also distribute chips to different threads...

mconcas avatar Aug 29 '22 12:08 mconcas

This PR did not have any update in the last 30 days. Is it still needed? Unless further action in will be closed in 5 days.

github-actions[bot] avatar Sep 29 '22 01:09 github-actions[bot]

Hi @shahor02, MC label assgnment should be online. I attach also the result for run 519041, where it is visible the effect of squashing pn last of the four bars. Screenshot 2022-10-10 at 15 11 12 Screenshot 2022-10-10 at 16 38 09

Also eta for tracks. Screenshot 2022-10-10 at 16 33 20

I think this is now working, although I have the feeling is not going to (completely) fix the "M" shape of tracks. Which in a sense is also consistent with the study that Ivan did (see usual ticket) which was on a masked run.

Shall we give this a try on a small test run at p2?

Then we can think of extending it for MFT, which should be just a matter forwarding settings.

mconcas avatar Oct 10 '22 15:10 mconcas

@mconcas thanks! In principle, yes, but do you have CPU benchmark for e.g. 500 kHz data? We must match the processing rate at least 0.5Hz of incoming TF rate at ~500 kHz, better to 1Hz, may need to return pipelining options.

shahor02 avatar Oct 10 '22 15:10 shahor02

Hi @shahor02, I changed the logic to enable this, so that we can configure it with keyvals, eg:

--configKeyValues='ITSClustererParam.maxROFSquashingDepth=2;ITSClustererParam.maxRowColDiffToSquash=1;'

The first parameter sets the number of ROFs to check. If depths is 0 (default) -> No Squash. The second is the usual row/col distance used in masking.

As these parameters are now specified in ClustererParam.h, I think they can be used in the MFTClustererParam as well.

mconcas avatar Oct 14 '22 09:10 mconcas

Hi @mconcas why not use for the neighbour definition the same maxRowColDiffToMask? The meaning is exactly the same. As for the number of ROFs to squash, I would not add it as a parameter to avoid changing ClusterParam each time we change ROF length in the AlpideParam. The signal stays over the threshold at most 8\mus, so with a smaller ROF length we need to check/squash pixels in 3 consecutive ROFs (applying the same maxBCDiffToMaskBias check to make sure we don't squash over the gaps) while for longer ROFs only 2 ROFs need to be checked. I would propose adding to the ClustererParam

float maxSOTMUS = 8.; // max expected signal over threshold in \mus

and defining dynamically number of ROFs to check (e.g. here: https://github.com/AliceO2Group/AliceO2/blob/07fa0397725db72a362a6ff2f7c086278a7a325e/Detectors/ITSMFT/ITS/workflow/src/ClustererSpec.cxx#L121)

int rofBC = mClusterer->isContinuousReadOut() ? alpParams.roFrameLengthInBC : (alpParams.roFrameLengthTrig / o2::constants::lhc::LHCBunchSpacingNS); // ROF length in BC
int nrofsToSquash = 2 + rofBC*o2::constants::LHCBunchSpacingMUS / (clParams.maxSOTMUS);

shahor02 avatar Oct 14 '22 10:10 shahor02

@shahor02 ok, and by setting e.g. a negative values to maxSOTMUS we could assume it's a way to disable the squashing. Can it work? Or do you prefer an explicit argument for enable/disable it?

mconcas avatar Oct 14 '22 11:10 mconcas

@mconcas yes, you can add a logic

int rofBC = mClusterer->isContinuousReadOut() ? alpParams.roFrameLengthInBC : (alpParams.roFrameLengthTrig / o2::constants::lhc::LHCBunchSpacingNS); // ROF length in BC
int nbcSeparation= rofBC + clParams.maxBCDiffToMaskBias;  // enable masking or squashing for ROFs separated by < than this number of BCs
nrofsToSquash = -1; // disable squashing
if (clParams.maxSOTMUS > 0) {
  nrofsToSquash = 2 + rofBC*o2::constants::LHCBunchSpacingMUS /clParams.maxSOTMUS; // use sqaushing
}
mClusterer->setMaxBCSeparationToMask(nbcToMask);
mClusterer->setNROFsToSquash(nrofsToSquash);

shahor02 avatar Oct 14 '22 12:10 shahor02

applying the same maxBCDiffToMaskBias check to make sure we don't squash over the gaps

BTW: I am not currently applying any selection on ROFs to be squashed based on the max distance in BC between two ROFs. I systematically look in the following 2 rofs, regardless. If I find the first empty or with no digits for current processed chip, I then just interrupt the process. If you want to realise something more refined (and closer to masking, I now realise) I can.

mconcas avatar Oct 14 '22 12:10 mconcas

The ROFs are not guaranteed to be contiguous even in continuous readout (e.g. empty ROF may be absent), one should always check if 2 ROFs are adjascent.

shahor02 avatar Oct 14 '22 12:10 shahor02

@mconcas ready to be merged?

shahor02 avatar Oct 19 '22 21:10 shahor02

@mconcas ready to be merged?

In principle yes, I just wanted to x-check two things with you.

  1. I did not implement any explicit logic to forbid the usage of masking and squashing together. I wanted to ask you how this case should be managed, should we crash the WF with a very clear fatal error? Or just ignore one of the two and log the choice?
  2. About the default values for ClustererParams. For the moment the defaults are set such as the squashing is applied, exactly like the masking: https://github.com/AliceO2Group/AliceO2/blob/9af47203131c8cd1f31a1920236a05908b3c4fba/Detectors/ITSMFT/common/reconstruction/include/ITSMFTReconstruction/ClustererParam.h#L38 https://github.com/AliceO2Group/AliceO2/blob/9af47203131c8cd1f31a1920236a05908b3c4fba/Detectors/ITSMFT/common/reconstruction/include/ITSMFTReconstruction/ClustererParam.h#L39 Is that OK? Will these parameters be updated from CCDB too?

mconcas avatar Oct 20 '22 06:10 mconcas

Hi @mconcas

For the simultaneous squashing / masking request: I would simply throw an exception with corresponding message. At the moment I would disable the squashing by default, so that the object loaded from the CCDB imposes the same behaviour. Can change this later. One more question: did you try running with multiple threads (--nthreads > 1) ?

Cheers, Ruben

shahor02 avatar Oct 20 '22 08:10 shahor02

For the simultaneous squashing / masking request: I would simply throw an exception with corresponding message.

Ok

At the moment I would disable the squashing by default, so that the object loaded from the CCDB imposes the same behaviour. Can change this later.

Ok, changing to -10, then.

One more question: did you try running with multiple threads (--nthreads > 1) ?

I just tried, I get same results with --nthreads=2 passed both to stf-decoder and its-reco-wf (clusterer). Seems reasonable, as all the squashing happens inside the Reader::getNextChipData, which is not split over threads.

mconcas avatar Oct 20 '22 09:10 mconcas

@shahor02 thanks, a lot for the extensive study. I am now investigating it and I am able to reproduce the plots.

I have one question about the masking you applied, what are the arguments, exactly? With the one I set I reproduce the discrepancy, but it is less frequent and there are some cases where squashing is "better" at suppressing than masking, which is not what I see in your plots and what I would expect.

E.g. (balck is raw, red is masking and blue is squashig) Screenshot 2022-10-24 at 11 44 48

mconcas avatar Oct 24 '22 09:10 mconcas

Hi @mconcas

I was using (for masking)

CLPAR=" --configKeyValues ITSClustererParam.maxBCDiffToMaskBias=10;ITSClustererParam.maxBCDiffToSquashBias=-10;ITSClustererParam.maxSOTMUS=32.; "
o2-raw-tf-reader-workflow $GLO_ARG --input-data lst.txt --onlyDet ITS | \
o2-itsmft-stf-decoder-workflow $GLO_ARG --no-clusters --digits $CLPAR | \
o2-its-reco-workflow $GLO_ARG --disable-mc --digits-from-upstream --disable-tracking --trackerCA $CLPAR

shahor02 avatar Oct 24 '22 10:10 shahor02

@shahor02 PR is updated. I have also updated stf-decoder and ClustererSpec for MFT.

mconcas avatar Oct 26 '22 16:10 mconcas