
CLUE algorithm and Phoenix visualiser

Open Lysarina opened this issue 1 year ago • 9 comments

I am updating ldmx-sw, here are the details.

What are the issues that this addresses?

This resolves #1411

I have worked with ECal clustering this summer and implemented the CLUE algorithm from CMS. The main goal was to use this for electron clustering, so I've aimed for number of clusters == number of electrons while also working to get a high energy purity (i.e. how much of the energy in the cluster comes from the same initial electron). It works pretty well! I've also added a simple reclustering clause to try to handle merged clusters. It results in a more prominent peak at number of clusters == number of electrons but also introduces some overcounting, so it is currently an optional parameter.
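For readers unfamiliar with CLUE, here is a minimal single-layer sketch of the general idea (density, distance to nearest higher-density hit, then seed / outlier / follower classification). It is not the code in this PR: the function name `clue`, the `(x, y, energy)` hit tuples, the 0.5 neighbour weight and the index tie-break are illustrative assumptions. Only the parameter names dc, rhoc, deltac and deltao mirror the config further down.

```python
import math

def clue(hits, dc, rhoc, deltac, deltao):
    """Toy CLUE pass. hits: list of (x, y, energy). Returns one label per hit, -1 = outlier."""
    n = len(hits)

    def dist(i, j):
        return math.hypot(hits[i][0] - hits[j][0], hits[i][1] - hits[j][1])

    # 1. Local density: own energy plus half the energy of neighbours within dc (assumed weighting).
    rho = [
        hits[i][2] + 0.5 * sum(hits[j][2] for j in range(n) if j != i and dist(i, j) <= dc)
        for i in range(n)
    ]

    # 2. Distance to the nearest hit with higher density; ties are broken by index.
    delta = [math.inf] * n
    nearest_higher = [-1] * n
    for i in range(n):
        for j in range(n):
            if (rho[j], j) > (rho[i], i) and dist(i, j) < delta[i]:
                delta[i] = dist(i, j)
                nearest_higher[i] = j

    # 3. Walk hits from highest to lowest density so a follower's parent is already labelled.
    labels = [-1] * n
    n_clusters = 0
    for i in sorted(range(n), key=lambda k: (rho[k], k), reverse=True):
        if rho[i] >= rhoc and delta[i] > deltac:
            labels[i] = n_clusters          # seed: dense and far from anything denser
            n_clusters += 1
        elif rho[i] < rhoc and delta[i] > deltao:
            labels[i] = -1                  # outlier: sparse and isolated
        elif nearest_higher[i] >= 0:
            labels[i] = labels[nearest_higher[i]]  # follower: inherit the parent's label
    return labels

# Two energetic hits with low-energy neighbours, plus one isolated low-energy hit.
hits = [(0, 0, 600), (5, 0, 100), (0, 5, 100), (100, 0, 700), (105, 0, 100), (300, 300, 50)]
labels = clue(hits, dc=0.0, rhoc=550.0, deltac=10.0, deltao=40.0)
print(labels)  # [1, 1, 1, 0, 0, -1]
```

With dc = 0 (as in the config below) each hit's density reduces to its own energy in this sketch, so rhoc effectively acts as a seed energy threshold.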

This also partly resolves #1394 in getting the cluster producer to work

Additionally, I have added a Phoenix-based visualiser LDMX-VIS, located in EventDisplay/ldmx-vis. I have also made an analyzer VisGenerator in DQM (maybe not the right folder for this) that writes event data into JSON files that can be visualised by Phoenix.
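As a rough illustration of what "event data in JSON for Phoenix" can look like, here is a small sketch loosely modelled on Phoenix's documented event layout (one top-level key per event, object types such as "Hits" holding named collections). The function name `to_phoenix_event`, the collection name "EcalRecHits" and the plain [x, y, z] hit entries are assumptions for illustration; the actual schema written by VisGenerator is not shown in this thread.

```python
import json

def to_phoenix_event(event_number, run_number, hit_positions):
    """Build one event dictionary with a single hit collection (hypothetical layout)."""
    return {
        "event number": event_number,
        "run number": run_number,
        # "EcalRecHits" is a made-up collection name; each hit is just [x, y, z].
        "Hits": {"EcalRecHits": [list(p) for p in hit_positions]},
    }

# One event with two hits; the top-level key is the event's display name.
events = {"Event 1": to_phoenix_event(1, 0, [(0.0, 0.0, 250.0), (4.2, 0.0, 260.0)])}
text = json.dumps(events, indent=2)
print(text)
```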

Fixes

While I feel mostly happy with the code, I created what is basically a copy of WorkingCluster.h in Ecal called WorkingEcalCluster. It stores EcalHits as normal objects instead of pointers because, when I started, I did not really know how to handle pointers :^) I never had time to make CLUE use WorkingCluster instead, but this would be a nice fix. WorkingEcalCluster does contain some other improvements, such as a parameterless constructor and the removal of EcalGeometry, which is not actually needed in that file.

Check List

  • [x] I successfully compiled ldmx-sw with my developments
  • [x] I ran my developments and the following shows that they are successful.

Below are histograms produced by EcalClusterAnalyzer, an analyzer I created to analyze the performance of the clustering

Graphs for unmodified CLUE:

  • 2 electrons: number_of_clusters_2e, energy_purity_2e, clusterless_percentage_2e
  • 3 electrons: number_of_clusters_3e, energy_purity_3e, clusterless_percentage_3e

versus the initial algorithm (TemplatedClusterFinder):

  • 2 electrons: number_of_electrons_2e, energy_purity_2e, clusterless_percentage_2e
  • 3 electrons: number_of_electrons_3e, energy_purity_3e, clusterless_percentage_3e

Reclustering with CLUE:

  • 2e: number_of_clusters_2e_reclustering
  • 3e: number_of_clusters_3e_reclustering

Visualisation examples: 3e_clue_32, layers_7
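To make the energy purity metric concrete, here is a small sketch of one way to compute it for a single cluster, following the definition given above (the share of the cluster's energy that comes from one initial electron). The function name `energy_purity` and the `(origin_id, energy)` input format are hypothetical; EcalClusterAnalyzer may compute it differently.

```python
from collections import defaultdict

def energy_purity(cluster_hits):
    """cluster_hits: iterable of (origin_id, energy) pairs for one cluster.

    Returns the fraction of the cluster's energy contributed by its dominant origin electron.
    """
    per_origin = defaultdict(float)
    for origin_id, energy in cluster_hits:
        per_origin[origin_id] += energy
    total = sum(per_origin.values())
    return max(per_origin.values()) / total if total > 0 else 0.0

# A cluster with 80 units of energy from electron 1 and 20 from electron 2.
purity = energy_purity([(1, 50.0), (1, 30.0), (2, 20.0)])
print(purity)
```

A purity of 1.0 would mean every hit in the cluster traces back to the same electron; merged clusters pull the value down.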

Here is the (slightly messy) config I use for ldmx fire

from LDMX.Framework import ldmxcfg
p = ldmxcfg.Process('sim')
events = 100
p.maxEvents = events
p.termLogLevel = 0
p.logFrequency = 1

nbrOfElectrons = 2

# Initial algo
seedThresh = 350. 
cutoff = 10.

# CLUE
CLUE = True
debug = False
recluster = False
layers = 1
dc = 0.
rhoc = 550.
deltac = 10.
deltao = 40.

p.inputFiles = [
    f"data/{nbrOfElectrons}e/sim_{i}_{nbrOfElectrons}e.root"
    for i in range(1, 31)
]
p.outputFiles = ["all.root"]
p.histogramFile = "clusters.root"

import LDMX.Ecal.EcalGeometry # geometry required by sim

# import chip / geometry conditions
import LDMX.Ecal.ecal_hardcoded_conditions
import LDMX.Ecal.digi as ecal_digi
import LDMX.DQM.dqm as dqm
import LDMX.Ecal.ecalClusters as cl
import LDMX.Ecal.ecal_trig_digi as etrigdigi
import LDMX.Trigger.trigger_energy_sums as etrig

json = dqm.VisGenerator()
json.filename = "vis.json"
json.originIdAvailable = True
json.nbrOfElectrons = nbrOfElectrons

cluster = cl.EcalClusterProducer()
cluster.seedThreshold = seedThresh
cluster.cutoff = cutoff
# CLUE
cluster.CLUE = CLUE
cluster.dc = dc
cluster.rhoc = rhoc
cluster.deltac = deltac
cluster.deltao = deltao
cluster.debug = debug
cluster.nbrOfLayers = layers
cluster.reclustering = recluster

clan = cl.EcalClusterAnalyzer()
clan.nbrOfElectrons = nbrOfElectrons

p.logPerformance = True

p.sequence.extend([
    ecal_digi.EcalDigiProducer(),
    ecal_digi.EcalRecProducer(),
    cluster,
    clan,
    json,
    ])

Lysarina, Aug 28 '24 13:08

You might be right @tomeichlersmith that visualization and clustering are conceptually separate... even if they were used in tandem for this project. Do you have some handy github wizardry that you could share with @Lysarina for how to cherry pick the different parts into two separate PRs, to help this move along?

Also, any good ideas for where to keep the visualizer analyzer? In EventDisplay or DQM?

bryngemark, Sep 02 '24 08:09

In this specific situation, I would just make new branches and copy over your updates to those new branches. This also gives you the opportunity to make branch names that reference the issue number. You can use git checkout to do this copying by specifying the files you want to get from that branch after --. (Note: tab-complete probably won't work since these files don't exist on trunk.)

git switch trunk
git pull
git switch -c 1411-ecal-clue-clustering
# just an example, you'll need to do more of these to get all of the files
git checkout ella-dev-clustering -- Ecal/include/Ecal/CLUE.h

You can git commit whenever you want. The files you've checked out this way will already be git added. I would suggest many small commits so that the commit messages reference what you are adding, but to each their own.

The same procedure can be done for the vis branch (make sure to switch back to trunk before creating the new branch), but I have some cleanup notes that may be helpful at this stage.

  • I think the processors that produce JSON for Phoenix should reside in the EventDisplay submodule. You can delete everything that is currently there - it is broken and no-one has used it in a long time. Use git mv after you git checkout a file so that git registers the move.
  • I do not want a full copy of nlohmann json.hpp in our source repo. I think adding it as a submodule git submodule add https://github.com/nlohmann/json.git EventDisplay/json makes the most sense and then using add_subdirectory within EventDisplay/CMakeLists.txt. https://json.nlohmann.me/integration/cmake/#embedded
  • I would like to avoid copying a pile of Typescript into ldmx-sw so if we can use Phoenix another way, I would prefer that; however, if we need to create our own Typescript application in order to properly use Phoenix and all its features, then I second @tvami 's points about removing the extra files pertaining to other experiments. (Maybe we put the Typescript app in some other repository? Does it even run from within the container? These are the types of details I would iron out on an event display-specific PR)

tomeichlersmith, Sep 02 '24 13:09

* I do not want a full copy of nlohmann json.hpp in our source repo. I think adding it as a submodule `git submodule add https://github.com/nlohmann/json.git EventDisplay/json` makes the most sense and then using `add_subdirectory` within `EventDisplay/CMakeLists.txt`. https://json.nlohmann.me/integration/cmake/#embedded

Is there a reason for not wanting the full header? I think that is a super common way to use it

EinarElen, Sep 03 '24 10:09

I don't have a good reason, I just like that it's easier to find the original project (and thus its documentation). I also want to point out that acts also uses nlohmann/json, so we could do an install into the image for both to use in the future.

Probably should have worded that comment less strongly - I would like to move it to a submodule but it's not necessary. If we move to putting it in the image, then we can just as easily remove the header here as well.

tomeichlersmith, Sep 03 '24 14:09

nlohmann/json.hpp 3.10.5 is shipped with ROOT 6.32.08, which I found out while exploring

https://github.com/LDMX-Software/docker/pull/103

so it looks like it will be built into the image from ROOT.

tomeichlersmith, Jan 28 '25 19:01

Hi @Lysarina - I do not want to lose this excellent work and so I'm checking to see if I should take over this PR. Are you stuck on what to do next? (I can help clear up any confusions.) Or have you been pulled into other work? (In which case, I can take over the PR to make sure your work gets shared with others on LDMX.)

tomeichlersmith, Mar 19 '25 16:03

Hi @tomeichlersmith , I'm very sorry for completely disappearing from this PR! I only worked at LDMX as a summer intern and did not quite manage to finish the last details, my plan was to do it after the summer but then I got pulled into studies and never got around to this. I think if you are able to take over the PR it would probably be smoother, and I would greatly appreciate it! Thank you for checking up!

Lysarina, Mar 23 '25 17:03

Thank you for responding and thank you for your excellent work here :) I will pull it across the finish line :checkered_flag:

tomeichlersmith, Mar 24 '25 13:03

thank you for stepping in @tomeichlersmith ! do you think you could try to understand why this branch breaks the overlay producer? what i see is that the ecal simhits get messed up when pileup is added, which makes everything downstream wrong.

Plots: ecalESum, ecalClusterE

The difference to before is that for all the development work, a multiparticle gun was used.

bryngemark, Mar 27 '25 10:03

I have copied the EventDisplay/ldmx-vis directory into its own repo (https://github.com/LDMX-Software/event-vis). I have left this repo private because I can't get it to run, but the code is there to resume work.

tomeichlersmith, May 21 '25 17:05

And with that we should close this PR. Thanks to everybody who contributed to this!

tvami, May 21 '25 18:05