ldmx-sw Persist the actual simulated position of the ECAL hit rather than the cell center

Is your feature request related to a problem? Please describe.

This came up in https://github.com/LDMX-Software/ldmx-sw/issues/1482 Currently the ECAL simhit positions are exactly the same as the rechit positions since they go through the EcalGeometry. See https://github.com/LDMX-Software/ldmx-sw/blob/34d43a553320a549b81a9dd30f31ab7c826a0d93/SimCore/src/SimCore/SDs/EcalSD.cxx#L79-L110

In the code there is a comment about the change this issue brings up. Please note the other subsystems do use the real positions

HCAL https://github.com/LDMX-Software/ldmx-sw/blob/34d43a553320a549b81a9dd30f31ab7c826a0d93/SimCore/src/SimCore/SDs/HcalSD.cxx#L159
Scoring plans https://github.com/LDMX-Software/ldmx-sw/blob/34d43a553320a549b81a9dd30f31ab7c826a0d93/SimCore/src/SimCore/SDs/ScoringPlaneSD.cxx#L55
Tracker: https://github.com/LDMX-Software/ldmx-sw/blob/34d43a553320a549b81a9dd30f31ab7c826a0d93/SimCore/src/SimCore/SDs/TrackerSD.cxx#L55
TrigScint: https://github.com/LDMX-Software/ldmx-sw/blob/34d43a553320a549b81a9dd30f31ab7c826a0d93/SimCore/src/SimCore/SDs/TrigScintSD.cxx#L62

Although I should say that the TS has this line

 auto volumePosition{topTransform.Inverse().TransformPoint(G4ThreeVector())};

which I'm not sure what it does / is needed, and if it is needed why the other subsystems dont use it.

Describe the solution you'd like

Persist the actual simulated position of the ECAL hit rather than the cell center i.e.

hit.setPosition(position.x(), position.y(), position.z());

Describe alternatives you've considered

Keep as it is.

Oct 02 '24 21:10 tvami

Re: https://github.com/LDMX-Software/ldmx-sw/issues/1482#issuecomment-2389698590

Thanks @tomeichlersmith for the context. I just feel like that with this the meaning of the simhit position is lost. And you are right that it's not relevant to data.

I'd like to make a quick test to see how things change if we move to this. The question about how we choose a position in the cell could be moved to the next step: i.e. the individual simhits have their real positions and then we do the merging and at that point set the location for the merged hit (which could be an energy weighted avg maybe?)

Oct 02 '24 21:10 tvami

Screenshot 2024-10-02 at 19 32 04 The BDT score changed a little, which is a bit surprising

Oct 03 '24 02:10 tvami

Screenshot 2024-10-02 at 19 34 23 Number of rechits

Oct 03 '24 02:10 tvami

Screenshot 2024-10-02 at 19 35 31 total energy

Oct 03 '24 02:10 tvami

Quite a lot of changes in the log level https://github.com/LDMX-Software/ldmx-sw/actions/runs/11152155194/job/30997856789#step:6:22346

Trigger energy by quite often with 5 GeV, sometimes upto 60 GeV. I dont know if it's coming from here, the PR about the thresholds should not change the energy part only if it passes, and also this is 1e so there the threshold is not even changed.

Then the lin-reg changes, although the K-S does not say it's bad.

Anyway small changes here and there. I think we should have this, just for the geometry testing aspect, and so that the simhit positions have a specific meaning (and not just a copy of the rechit positions).

Oct 03 '24 02:10 tvami

I got carried away looking at the logs and seeing a difference of 0.59 crop up a bunch, so I used the logs to get a difference between the Esums. The plot omits the zero-difference events since they do not show up in the diff. I was mainly worried that all of the differences were exactly this number which would make me think we were missing a hit somewhere, but that's not the case. Still unsure where these deviations are coming from.

esumdiff

What I did

After downloading the inclusive PR validation artifact, unzipping and unpacking it. I used awk to process the diff output into a CSV which I then could plot.

diff output.log gold.log | awk -f esumdiff.awk | tr ';' ',' > esumdiff.csv

Plot made with Python 3.10.12

python3 -m venv venv
. venv/bin/activate
pip install matplotlib pandas
python3 plot.py

esumdiff.awk

BEGIN {
  printf "new;old;event"
}
{
  if ($1 == "<") {
    # original side of diff
    printf "%s", $11
  } else if ($1 == ">") {
    # other side of diff
    printf "%s%s\n", $11, $5
  }
}

plot.py

import pandas as pd
import matplotlib.pyplot as plt

esum = pd.read_csv('esumdiff.csv')
plt.hist(
    esum['new']-esum['old'],
    bins=40,
    range=(-0.20,1.80)
)
plt.xlabel('New - Gold ECal Esum / MeV')
plt.ylabel('Events')
plt.yscale('log')
plt.title('Inclusive PR Validation Sample')
plt.savefig('esumdiff.png', bbox_inches='tight')
plt.close()

Oct 03 '24 13:10 tomeichlersmith

Yeah so at this point it's a small difference but also not a null difference. Which to mean concludes that we should do the change, but we should not worry that it changed the physics we already published / presented. (I'm actually not sure if we have a published paper with v14 [and if this difference between GDML and EcalGeom came in for v14 or earlier])

Oct 03 '24 14:10 tvami