AliceO2 icon indicating copy to clipboard operation
AliceO2 copied to clipboard

Qatable

Open f3sch opened this issue 1 year ago • 3 comments

@miranov25

f3sch avatar Oct 29 '24 15:10 f3sch

REQUEST FOR PRODUCTION RELEASES: To request your PR to be included in production software, please add the corresponding labels called "async-" to your PR. Add the labels directly (if you have the permissions) or add a comment of the form (note that labels are separated by a ",")

+async-label <label1>, <label2>, !<label3> ...

This will add <label1> and <label2> and removes <label3>.

The following labels are available async-2023-pbpb-apass4 async-2023-pp-apass4 async-2024-pp-apass1 async-2022-pp-apass7 async-2024-pp-cpass0

github-actions[bot] avatar Oct 29 '24 15:10 github-actions[bot]

Hello @f3sch

Thank you for the pull request. Here is the link to the corresponding material:

github issue:

  • https://github.com/AliceO2Group/AliceO2/issues/13293 first presentation and jupyter notebook: *https://indico.cern.ch/event/1433903/#11-performance-of-local-occupa
  • https://indico.cern.ch/event/1433903/contributions/6038742/attachments/2897887/5083753/trackDelta.ipynb
  • https://indico.cern.ch/event/1433903/contributions/6038742/attachments/2897887/5116573/%CE%9F2-5095-dEdx%20parameterization%20(2).pdf
    • slide 10

miranov25 avatar Oct 30 '24 14:10 miranov25

image

miranov25 avatar Oct 30 '24 14:10 miranov25

undrafting to check CI Do not merge until @miranov25 completes the closure tests. Converter https://github.com/AliceO2Group/O2Physics/pull/8404

f3sch avatar Nov 12 '24 09:11 f3sch

Thank you for following. We are still fixing digital signal processing, so I have to finish other stuff. I haoe I will manage to check it this week, as I want to have it soon in productions.

miranov25 avatar Nov 12 '24 13:11 miranov25

Internal information - test at GSI using debug streamer:

rsync -avzt --progress [email protected]:/afs/cern.ch/user/f/fschlepp/public/output.tar.gz 
 /lustre/alice/users/miranov/NOTESData/alice-tpc-notes/JIRA/O2-5095/testDelta

miranov25 avatar Nov 18 '24 14:11 miranov25

Hello @f3sch,

Thank you for implementing the changes. I have identified two major issues so far:

  1. Coding Resolution: In my proposal described during the AOT meeting, I suggested using coding with 10-25% resolution. However, you implemented 1-sigma coding, which is insufficient.

  2. dE/dx Scaling Issue: I will provide a detailed description in the next comment.

Additionally, I would recommend using consistent variable names in both the source code and the debug stream to make it easier to test and ensure code consistency.

The code for testing I committed to the tpc gitlab, I am not sure if we have some code for the O2 testing.

  • https://gitlab.cern.ch/alice-tpc-offline/alice-tpc-notes/-/blob/43767e53e02d1e2200d45d19ea3edd026a426f9d/JIRA/O2-5095/test_trackQADelta.C

I produced QA plots with that.

Thank you for your help

Marian

image

miranov25 avatar Nov 18 '24 16:11 miranov25

Track Residual Coding:

  • As I wrote earlier, we will need a few samples for delta coding—for this, I recommend using 5 samples per sigma, as shown in the slide.
  • In the current code, the Beta value was incorrectly coded, leading to an underestimation of the error for high dE/dx (low Beta) particles.
  • code as in the pull request
const float beta0 = std::min(tpcOrig.getdEdx().dEdxMaxTPC / 50.f, 1.f);
  • should be:
    tree->SetAlias("betaC","sqrt(min(50./mdEdx.dEdxMaxTPC, 1.))");

After the fix, the pulls—normalized errors—are now close to one (as should be) as shown in the slide.

Test code used in slide:

  • https://gitlab.cern.ch/alice-tpc-offline/alice-tpc-notes/-/blob/47468ba524987e1b17f90d190131be94dd8f0548/JIRA/O2-5095/test_trackQADelta.C#L76-126

image

miranov25 avatar Nov 19 '24 07:11 miranov25

@miranov25 at your leisure, I updated the code and uploaded updated files for you to test at the path posted above.

f3sch avatar Nov 19 '24 11:11 f3sch

Hello @f3sch,

Thank you very much for making the changes. I see you have modified the Beta0 estimator as we discussed and agreed upon.

However, there is another adjustment needed related to the sampling. In the current code, you are using sampling with 1 sigma resolution, but we need to sample it with 5 bins per sigma (to be able to correct later)

I will add the proposed modification directly to the code, temporarily as a hardcoded number.

Additional modifications needed

  • Use sampling of sigma/5 as I described above.
  • Ensure consistent variable names between the code and the streamer, so the same variables are reflected in the test code.
  • Include the expected sigma in the output, so it does not need to be defined as an alias in the test code, so test will be easier to interpret.

For the sampling of sigma/5, we should see approximately 5 bins in dRefContY, but currently, we only have 1 bin. In the example I use case of the dY for the TPCITS but it is similar in all other cases.

Best regards Marian

   tree->Draw("trackQAHolder.dRefContY>>his(100,-50,50)","itstpc_dP0!=0","");

image

miranov25 avatar Nov 19 '24 16:11 miranov25

For now, I suggest modifying the scaleCont and scaleGlo functions by using nBins as a parameter.

A more elegant approach would be to create sigmaCont and sigmaGlo functions, and then apply scaling using (sigma<...>)/nBins in the subsequent code. Like that the variables and code will be self documented

miranov25 avatar Nov 19 '24 16:11 miranov25

@miranov25 I do not understand, please send me the code you want to have changed, or add it here as a commit.

f3sch avatar Nov 20 '24 07:11 f3sch

Hello @f3sch,

To minimize the scope of changes while maintaining the desired functionality and improving the precision of residual coding, I propose adding a new parameter:

constexpr float trackQAScaleBins{5.f};

I assume 5 bins will be a reasonable compromise between the disc size/entropy and precision, such we can use that later for corrections.

This parameter can then be used in the calculation of residuals as shown below:

// Calculate deltas for contributors  
trackQAHolder.dRefContY = safeInt8Clamp((itsCopy.getY() - tpcCopy.getY()) * scaleGlo(0) * o2::aod::track::trackQAScaleBins);  
trackQAHolder.dRefContZ = safeInt8Clamp((itsCopy.getZ() - tpcCopy.getZ()) * scaleGlo(1) * o2::aod::track::trackQAScaleBins);  
trackQAHolder.dRefContSnp = safeInt8Clamp((itsCopy.getSnp() - tpcCopy.getSnp()) * scaleGlo(2) * o2::aod::track::trackQAScaleBins);  
trackQAHolder.dRefContTgl = safeInt8Clamp((itsCopy.getTgl() - tpcCopy.getTgl()) * scaleGlo(3) * o2::aod::track::trackQAScaleBins);  
trackQAHolder.dRefContQ2Pt = safeInt8Clamp((itsCopy.getQ2Pt() - tpcCopy.getQ2Pt()) * scaleGlo(4) * o2::aod::track::trackQAScaleBins);  

// Calculate deltas for global track against averaged contributors  
trackQAHolder.dRefGloY = safeInt8Clamp(((itsCopy.getY() + tpcCopy.getY()) * 0.5f - gloCopy.getY()) * scaleGlo(0) * o2::aod::track::trackQAScaleBins);  
trackQAHolder.dRefGloZ = safeInt8Clamp(((itsCopy.getZ() + tpcCopy.getZ()) * 0.5f - gloCopy.getZ()) * scaleGlo(1) * o2::aod::track::trackQAScaleBins);  
trackQAHolder.dRefGloSnp = safeInt8Clamp(((itsCopy.getSnp() + tpcCopy.getSnp()) * 0.5f - gloCopy.getSnp()) * scaleGlo(2) * o2::aod::track::trackQAScaleBins);  
trackQAHolder.dRefGloTgl = safeInt8Clamp(((itsCopy.getTgl() + tpcCopy.getTgl()) * 0.5f - gloCopy.getTgl()) * scaleGlo(3) * o2::aod::track::trackQAScaleBins);  
trackQAHolder.dRefGloQ2Pt = safeInt8Clamp(((itsCopy.getQ2Pt() + tpcCopy.getQ2Pt()) * 0.5f - gloCopy.getQ2Pt()) * scaleGlo(4) * o2::aod::track::trackQAScaleBins);  

This approach keeps the changes minimal while providing better control over the scaling and precision of the residual coding.

Best regards,
Marian

miranov25 avatar Nov 20 '24 17:11 miranov25

@miranov25 thank you, find the updated code and the updated file uploaded for you to check.

f3sch avatar Nov 21 '24 12:11 f3sch

Thank you for your input. It seems there’s still an issue. On average, the results look fine, but the unit test is failing.

The coding appears to be less precise than expected. One problem is that we should be using the nearest integer for rounding, but currently, that’s not the case. Additionally, I’m noticing tails in the data, which could indicate another issue.

At the moment, I haven’t pinpointed the exact cause of tail. I’m reviewing both the unit test and the code, and both seem fine on the surface. I will find the last problem. I plan to present pull request tomorrow, as we need this code urgently for the PbPb processing.

@f3sch, in the meantime, could you prepare a data volume estimator that is officially recognized by the OFFLINE team? I typically use tree->Print() for quick checks, but I recall there’s a “standard tool” available that we can utilize.

Thank you for your assistance!

miranov25 avatar Nov 24 '24 09:11 miranov25

Support material for statement above: Tail and rounding error (nint) - Unit test

Definition in unit test:

    tree->SetAlias("qpt","trackITSTPCProp.mP[4]");
    tree->SetAlias("beta","sqrt(min(50./mdEdx.dEdxMaxTPC, 1.))");
    tree->SetAlias("x","qpt/beta");
    for (int i=0; i<5; i++){
        tree->SetAlias(Form("scaleCont%d",i),Form("scaleBins/sqrt(%f**2 + (%f*x)**2)",trackQAScaleContP0[i],trackQAScaleContP1[i]));
        tree->SetAlias(Form("scaleGlo%d",i),Form("scaleBins/sqrt(%f**2 + (%f*x)**2)",trackQAScaleGloP0[i],trackQAScaleGloP1[i]));
    }
    tree->SetAlias("itstpc_dP0","(trackITSProp.mP[0]-trackTPCProp.mP[0])");
    tree->SetAlias("itstpc_dP0QA","(trackQAHolder.dRefContY)/scaleCont0");
    tree->SetAlias("itstpc_Delta0","(itstpc_dP0-itstpc_dP0QA)");
    tree->SetAlias("itstpc_Delta0Norm","(itstpc_dP0-itstpc_dP0QA)*scaleCont0");

image

miranov25 avatar Nov 24 '24 10:11 miranov25

Fix Scaling Issue for dRefCont Parameters

The issue with the tails has been identified and should be fixed. The incorrect scaling factor scaleGlo was used instead of scaleCont for trackQAHolder.dRefCont. This error was not immediately apparent due to the similarity in scaling values.

Bug Fix Applied to All 5 dRefCont Parameters:

// Original code:
trackQAHolder.dRefContY = safeInt8Clamp((itsCopy.getY() - tpcCopy.getY()) * scaleGlo(0));

// Corrected code:
trackQAHolder.dRefContY = safeInt8Clamp((itsCopy.getY() - tpcCopy.getY()) * scaleCont(0));

This fix ensures proper scaling and eliminates the observed tail behavior.

miranov25 avatar Nov 24 '24 18:11 miranov25

In case I used the same "buggy" value for decoding of delta everything is like expected and Unit test is OK. Normalized delta is -i nterval [-0.5,0.5] nad residuals after coding are very small (sigma/5.)as they should be.

image

Plot above should be compared to plot in comment above: https://github.com/AliceO2Group/AliceO2/pull/13633#issuecomment-2495923725

miranov25 avatar Nov 24 '24 18:11 miranov25

Also, other tests with the deltaGlobal P0 and deltaCont P4 are working as expected - after local fix https://github.com/AliceO2Group/AliceO2/pull/13633#issuecomment-2496163311

Test code: https://gitlab.cern.ch/alice-tpc-offline/alice-tpc-notes/-/blob/b65b22cc85a1f27cb862cd653e1c97b0cf5aabbf/JIRA/O2-5095/test_trackQADelta.C#L115-179

As an example unit test for the Glo deltaY:

    tree->SetAlias("glo_dP0","0.5*(trackITSProp.mP[0]+trackTPCProp.mP[0])-trackITSTPCProp.mP[0]");         // track delta
    tree->SetAlias("glo_dP0QA","(trackQAHolder.dRefGloY)/scaleGlo0");          // track delta decoded frmp dRefContY
    tree->SetAlias("glo_Delta0","(glo_dP0-glo_dP0QA)");
    tree->SetAlias("glo_Delta0Norm","(glo_dP0-glo_dP0QA)*scaleGlo0");

image

miranov25 avatar Nov 24 '24 19:11 miranov25

Subject: Delta Coding: Disk Size Impact and Full Data Size Report -

Using the tree->Print() function, I calculated the impact of adding 2 x 5 Bytes for delta coding in memory. These bytes are compressed on disk with a compression factor of approximately 5. For tracks with ITS+TPC associations, this adds around 2 additional bytes per track on disk.

Below is the full output of the tree->Print() function for delta branches. If there’s a preferred method for reporting data sizes or metrics, please let me know. I’d appreciate any guidance on using a standard tool for this purpose, as it would provide a more consistent evaluation, which you expected.

root [6] O2trackqa_001->GetListOfBranches()->Print("", "fDeltaRef*")
Collection name='TObjArray', class='TObjArray', size=23
 *Br    0 :fDeltaRefContParamY : fDeltaRefContParamY/B                        *
*Entries :   138216 : Total  Size=     138846 bytes  File Size  =      27159 *
*Baskets :        1 : Basket Size=     139240 bytes  Compression=   5.09     *
..............................................................................
 *Br    1 :fDeltaRefContParamZ : fDeltaRefContParamZ/B                        *
*Entries :   138216 : Total  Size=     138846 bytes  File Size  =      35413 *
*Baskets :        1 : Basket Size=     139240 bytes  Compression=   3.91     *
..............................................................................
 *Br    2 :fDeltaRefContParamSnp : fDeltaRefContParamSnp/B                    *
*Entries :   138216 : Total  Size=     138856 bytes  File Size  =      27635 *
*Baskets :        1 : Basket Size=     139240 bytes  Compression=   5.00     *
..............................................................................
 *Br    3 :fDeltaRefContParamTgl : fDeltaRefContParamTgl/B                    *
*Entries :   138216 : Total  Size=     138856 bytes  File Size  =      28595 *
*Baskets :        1 : Basket Size=     139240 bytes  Compression=   4.84     *
..............................................................................
 *Br    4 :fDeltaRefContParamQ2Pt : fDeltaRefContParamQ2Pt/B                  *
*Entries :   138216 : Total  Size=     138861 bytes  File Size  =      30003 *
*Baskets :        1 : Basket Size=     139240 bytes  Compression=   4.61     *
..............................................................................
 *Br    5 :fDeltaRefGloParamY : fDeltaRefGloParamY/B                          *
*Entries :   138216 : Total  Size=     138841 bytes  File Size  =      25885 *
*Baskets :        1 : Basket Size=     139240 bytes  Compression=   5.34     *
..............................................................................
 *Br    6 :fDeltaRefGloParamZ : fDeltaRefGloParamZ/B                          *
*Entries :   138216 : Total  Size=     138841 bytes  File Size  =      32123 *
*Baskets :        1 : Basket Size=     139240 bytes  Compression=   4.31     *
..............................................................................
 *Br    7 :fDeltaRefGloParamSnp : fDeltaRefGloParamSnp/B                      *
*Entries :   138216 : Total  Size=     138851 bytes  File Size  =      26160 *
*Baskets :        1 : Basket Size=     139240 bytes  Compression=   5.29     *
..............................................................................
 *Br    8 :fDeltaRefGloParamTgl : fDeltaRefGloParamTgl/B                      *
*Entries :   138216 : Total  Size=     138851 bytes  File Size  =      25496 *
*Baskets :        1 : Basket Size=     139240 bytes  Compression=   5.42     *
..............................................................................
 *Br    9 :fDeltaRefGloParamQ2Pt : fDeltaRefGloParamQ2Pt/B                    *
*Entries :   138216 : Total  Size=     138856 bytes  File Size  =      26742 *
*Baskets :        1 : Basket Size=     139240 bytes  Compression=   5.17     *
..............................................................................

miranov25 avatar Nov 24 '24 21:11 miranov25

@miranov25 The size estimate:

AO2D_new:
Tree O2trackqa_001          31.7681       MB, 17.5856       MB (compressed) 2.172% of total AO2D size on disk

AO2D:
Tree O2trackqa              21.5125       MB, 16.4135       MB (compressed) 2.030% of total AO2D size on disk

Produced for LHC23zzk 544508 40kHz pb-pb in 1 AO2D with:

#!/usr/bin/env bash

GLOSET=" -b  --shm-segment-size 160000000000 --monitoring-backend no-op:// --hbfutils-config o2_tfidinfo.root,upstream --timeframes-rate-limit 2 --timeframes-rate-limit-ipcid 554508 --fairmq-rate-logging 0 --early-forward-policy noraw "
SEED=42
MAXTF=-1

o2-reader-driver-workflow --max-tf $MAXTF $GLOSET | \
        o2-aod-producer-workflow $GLOSET \
        --info-sources ITS-TPC,TPC-TRD,ITS-TPC-TRD,TPC-TOF,ITS-TPC-TOF,TPC-TRD-TOF,ITS-TPC-TRD-TOF,ITS,TPC,TOF,FT0,FDD,FV0,TRD \
        --aod-writer-keep dangling --aod-writer-resfile "AO2D" --aod-writer-resmode RECREATE --disable-mc --pipeline aod-producer-workflow:1  --aod-writer-maxfilesize 4000 --lpmp-prod-tag LHC23zzk --reco-pass apass4_debug --trackqc-fraction 0.1 --configKeyValues "keyval.output_dir=/dev/null;keyval.input_dir=./workdir;;" --seed $SEED #--with-streamers "1"

f3sch avatar Nov 25 '24 10:11 f3sch

@f3sch Great. Will you also post the updated streamer so I can perform a check? If the new check passes, I will recommend proceeding with the full request.

miranov25 avatar Nov 25 '24 12:11 miranov25

Subject: Feedback on Modifications and tpcTime0 Compression

Hello @f3sch,

Thank you for the modifications. The new file has passed the unit test, and the ITS+TPC deltas are now stored correctly.

  • test: https://gitlab.cern.ch/alice-tpc-offline/alice-tpc-notes/-/blob/06c7ab97b3da182a86d2ec841c158ed210b8e2da/JIRA/O2-5095/test_trackQADelta.C

I have a comment regarding the tpcTime0 modification. After reviewing the trackQAHolder.tpcTime0, I ran the following check:

tree->Draw("trackQAHolder.tpcTime0:trackTPCOrig.mTime0")

Currently, the tpcTime0 is not compressed effectively (compression factor 1.2) and takes up the same data volume as the 8 dEdx variables, which are much better compressed.

We could improve the compression of tpcTime0 by storing it as a delta relative to a reference time0. If I recall correctly, I previously used a delta relative to the global time, which reduced the data volume by a factor of 4-8.

Given the upcoming deadline next week, I won’t make this a blocker. I suggest proceeding with the pull request as is and testing the option for relative coding in parallel.

Marian

*............................................................................*
*Br    1 :fTPCTime0 : fTPCTime0/F                                            *
*Entries :    39016 : Total  Size=     156650 bytes  File Size  =     131076 *
*Baskets :        1 : Basket Size=     157088 bytes  Compression=   1.19     *
*............................................................................*
*Br    2 :fTPCDCAR  : fTPCDCAR/S                                             *
*Entries :    39016 : Total  Size=      78609 bytes  File Size  =      61783 *
*Baskets :        1 : Basket Size=      79056 bytes  Compression=   1.26     *
*............................................................................*
*Br    3 :fTPCDCAZ  : fTPCDCAZ/S                                             *
*Entries :    39016 : Total  Size=      78609 bytes  File Size  =      63963 *
*Baskets :        1 : Basket Size=      79056 bytes  Compression=   1.22     *
*............................................................................*
*Br    4 :fTPCClusterByteMask : fTPCClusterByteMask/b                        *
*Entries :    39016 : Total  Size=      39646 bytes  File Size  =      19512 *
*Baskets :        1 : Basket Size=      40040 bytes  Compression=   2.00     *

image

miranov25 avatar Nov 26 '24 16:11 miranov25

Hello @sawenzel and @AliceO2Group/framework-admins,

Could you please review and approve the pull request? There are two approvals still missing, and I’m unsure who the current admins are.

The deadline for the Release Validation is next Monday, but I’ll be attending Physics Week, so it would be ideal to have everything finalized this week.

Thank you

miranov25 avatar Nov 28 '24 15:11 miranov25

Hello @pzhristov, @ddobrigk, & other @AliceO2Group/framework-admins

Could you please approve the pull request? Felix has provided an estimate of the data volume in the comment above:
[Pull Request Comment](https://github.com/AliceO2Group/AliceO2/pull/13633#issuecomment-2497493430). We need this change for APass1, which means it should be included in the release validation tomorrow.

Regarding the fractional usage of the trackQA table within APass1, I will discuss this during the ALICE Physics Week and will send you additional material via chat. Since approximately 50% of the statistics are impacted by significant distortion fluctuations that we won't be able to resolve in APass1, I believe we can afford to store this information for now (as we reconstruct only a fraction of the data).

If we restrict the reconstruction to only "good CTFs" (as agreed last week) the output data volume should not pose an issue for APass1. For APass2, which will follow once we resolve the fluctuation problem, we can revisit and adjust the fraction as needed.

I will be flying to Salerno in 2 hours and will be offline for some time, with very limited connectivity.

miranov25 avatar Dec 01 '24 13:12 miranov25

Hi @miranov25 @f3sch, thanks, this looks good to me, all good to merge! (I have no rights, so maybe @pzhristov can do the honors if alright with him :-) )

ddobrigk avatar Dec 02 '24 08:12 ddobrigk

Thank you

miranov25 avatar Dec 02 '24 10:12 miranov25