ldmx-sw Modernize LHE Reading Memory Allocation

Right now there are a lot of raw new and delete calls floating around the code that uses the LHEReader. I think we can clean this up pretty easily with std::unique_ptr, which handles the cleanup automatically and safely.
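As a rough illustration of the kind of change meant here, below is a minimal sketch of a reader function that returns a std::unique_ptr instead of a raw pointer. The names Event, g_live_events, and readNextEvent are hypothetical stand-ins, not the actual ldmx-sw classes or signatures.

```cpp
#include <memory>
#include <string>
#include <utility>

// Counts live Event objects; only here to make the cleanup observable.
int g_live_events = 0;

// Hypothetical stand-in for an event object (NOT the real ldmx-sw class).
struct Event {
  std::string body;
  explicit Event(std::string b) : body(std::move(b)) { ++g_live_events; }
  Event(const Event&) = delete;             // keep the counter honest
  Event& operator=(const Event&) = delete;
  ~Event() { --g_live_events; }
};

// Hypothetical reader function. Returning std::unique_ptr puts ownership in
// the type: the caller never writes delete, and the event is destroyed when
// the pointer goes out of scope (including on early return or exception).
std::unique_ptr<Event> readNextEvent(const std::string& text) {
  if (text.empty()) return nullptr;  // signals "no more events"
  return std::make_unique<Event>(text);
}
```

A caller would write something like `auto ev = readNextEvent(text);` and simply let `ev` go out of scope, where the old pattern needed a matching delete on every exit path.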

Oct 29 '20 14:10 tomeichlersmith

I wonder if it makes sense to just move to using an xml reader to replace what we have now.

Oct 29 '20 15:10 omar-moreno

I already did some prior research on LHE readers on an old issue: https://github.com/LDMX-Software/ldmx-sw/issues/482

Just adding this so no one repeats the same research I did.

Oct 29 '20 15:10 tomeichlersmith

I don't think using an xml reader is the best option.

I have coded up a quick program that scans the XML tree of an input LHE file. Since the LHE accord calls for newlines to be tossed around everywhere, we basically get a bunch of generic #text nodes: one for the content of a tag and one for the closing tag. For example,

<event>
   ...lines of event details...
</event>

Gets translated into

Node: 'event'
  Child: '#text' -> '...lines of event details...'
  Child: '#text' -> ''

This means we would still have to write a custom translator from this XML output topology into something we understand. The basic problem is that XML expects each line to open an XML tag (or be a comment), e.g.

<event>
  <numparticles>5</numparticles>
  <weight>0.911E+02</weight>
  <particle>
    <pdgid>11</pdgid>
    ...other particle details
  </particle>
  ..other particles
</event>

This is fundamentally different from the LHE format, and so (unless there is an XML schema floating about that solves all these problems for me) I think staying with our custom LHE reader is our best bet.
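To make the "custom translator" point concrete, here is a minimal sketch of what still has to happen to the '#text' content of an event node, whichever XML library hands it over. It assumes the standard LHE event layout (one header line with NUP IDPRUP XWGTUP SCALUP AQEDUP AQCDUP, followed by NUP particle lines of 13 fields each). The struct and function names are hypothetical, and the "#vertex x y z" line format is an assumption about our extension, not something confirmed in this thread.

```cpp
#include <array>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical containers; field comments give the LHE accord names.
struct Particle {
  int pdgId = 0;                  // IDUP
  int status = 0;                 // ISTUP
  std::array<int, 2> mothers{};   // MOTHUP(1..2)
  std::array<int, 2> colors{};    // ICOLUP(1..2)
  std::array<double, 5> p{};      // PUP: px, py, pz, E, m
  double lifetime = 0;            // VTIMUP
  double spin = 0;                // SPINUP
};

struct EventRecord {
  int numParticles = 0;           // NUP
  int processId = 0;              // IDPRUP
  double weight = 0;              // XWGTUP
  double scale = 0, alphaQED = 0, alphaQCD = 0;
  bool hasVertex = false;
  std::array<double, 3> vertex{}; // ASSUMED "#vertex x y z" extension
  std::vector<Particle> particles;
};

// Parses the raw text found between <event> and </event>.
// Returns false if a line is malformed or the particle count disagrees.
bool parseEventText(const std::string& text, EventRecord& out) {
  std::istringstream lines(text);
  std::string line;
  bool haveHeader = false;
  while (std::getline(lines, line)) {
    std::istringstream probe(line);
    std::string first;
    if (!(probe >> first)) continue;  // skip blank lines
    if (first == "#vertex") {
      if (!(probe >> out.vertex[0] >> out.vertex[1] >> out.vertex[2])) return false;
      out.hasVertex = true;
      continue;
    }
    if (first[0] == '#') continue;    // skip other comment lines
    std::istringstream row(line);
    if (!haveHeader) {
      if (!(row >> out.numParticles >> out.processId >> out.weight >>
            out.scale >> out.alphaQED >> out.alphaQCD)) return false;
      haveHeader = true;
      continue;
    }
    Particle q;
    if (!(row >> q.pdgId >> q.status >> q.mothers[0] >> q.mothers[1] >>
          q.colors[0] >> q.colors[1] >> q.p[0] >> q.p[1] >> q.p[2] >>
          q.p[3] >> q.p[4] >> q.lifetime >> q.spin)) return false;
    out.particles.push_back(q);
  }
  return haveHeader &&
         static_cast<int>(out.particles.size()) == out.numParticles;
}
```

None of this depends on how the text was extracted, which is why an XML library only replaces the tag-finding part of the reader and not the line-by-line translation.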

Details for Playing with XercesC

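Attached are my source file (https://github.com/LDMX-Software/SimCore/files/5810065/lhe.cxx.txt) and the example LHE (https://github.com/LDMX-Software/SimCore/files/5810066/eg.lhe.txt) that I was using. You need xerces-c installed somewhere to be able to do this.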

Compile and Run

export XERCESC_DIR=<path-to-xerces-c-install>
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH+${LD_LIBRARY_PATH}:}${XERCESC_DIR}/lib
g++ -I${XERCESC_DIR}/include -L${XERCESC_DIR}/lib lhe.cxx -lxerces-c
./a.out

Other LHE Readers

I have previously found a C++ LHE file reader (http://home.thep.lu.se/~leif/LHEF/index.html) that is part of the HepMC event record library (http://hepmc.web.cern.ch/hepmc/). I have investigated this reader before, and the main issues are that (1) it requires header and init blocks, which are annoying for any scripts we have that generate LHE files manually (without using MadGraph), and (2) it doesn't read in the #vertex flag that we have coded right now. The details of this investigation are in the old issue linked above (https://github.com/LDMX-Software/ldmx-sw/issues/482).

Jan 13 '21 17:01 tomeichlersmith

I've written a reader based on tinyxml for another project and have never run into the issues you are describing. Maybe it's the package you used?

Jan 13 '21 17:01 omar-moreno

Can you link/attach the reader? Maybe I just need to configure the parser differently than the defaults.

Jan 13 '21 17:01 tomeichlersmith

It's probably easier if I just integrate it into ldmx-sw. No point in dumping it here.

Jan 13 '21 17:01 omar-moreno