Possible memory corruption in cling
Check duplicate issues.
- [X] Checked for duplicates
Description
Since the switch to ROOT 6.30/02 (LCG 105) we have been experiencing segfaults related to dictionaries. The most straightforward reproducer is a plain #include of a specific header.
What makes me suspect memory corruption is that, when I tried to isolate which part of that header triggers the segfault, I noticed that (on a subset of the header) I could make the segfault appear and disappear just by shuffling some class definitions.
I also find it weird that the segfault seems to be related to an atexit function in libCling.so:
===========================================================
#10 0x00007f05fa744a7e in ?? ()
#11 0x00007ffd45bee240 in ?? ()
#12 0x00007f060b10d028 in ?? ()
#13 0x00007ffd45bee270 in ?? ()
#14 0x00007f05fa745920 in ?? ()
#15 0x00007f060b10c1a0 in ?? ()
#16 0x00007ffd45bee260 in ?? ()
#17 0x00007ffd45bee2c0 in ?? ()
#18 0x00007f05fa745b0d in ?? ()
#19 0x000000000204fc10 in ?? ()
#20 0x00000000133b2cc8 in ?? ()
#21 0x00000000133b2cc0 in ?? ()
#22 0x00007f060309016c in (anonymous namespace)::local_cxa_atexit(void (*)(void*), void*, cling::Interpreter*) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#23 0x00007ffd45bee260 in ?? ()
#24 0x00007f060b10d110 in ?? ()
#25 0x00007f060b10d020 in ?? ()
#26 0x00007f0607fe2e7a in ?? () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#27 0x00007f05fa741095 in ?? ()
#28 0x00007f05fa740f20 in ?? ()
#29 0x00007f060471d882 in (anonymous namespace)::GenericLLVMIRPlatformSupport::initialize(llvm::orc::JITDylib&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#30 0x00007f06031155f3 in cling::IncrementalExecutor::runStaticInitializersOnce(cling::Transaction&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#31 0x00007f0603092698 in cling::Interpreter::executeTransaction(cling::Transaction&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#32 0x00007f0603125b4a in cling::IncrementalParser::commitTransaction(llvm::PointerIntPair<cling::Transaction*, 2u, cling::IncrementalParser::EParseResult, llvm::PointerLikeTypeTraits<cling::Transaction*>, llvm::PointerIntPairInfo<cling::Transaction*, 2u, llvm::PointerLikeTypeTraits<cling::Transaction*> > >&, bool) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#33 0x00007f0603128d98 in cling::IncrementalParser::Compile(llvm::StringRef, cling::CompilationOptions const&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#34 0x00007f06030933dc in cling::Interpreter::DeclareInternal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::CompilationOptions const&, cling::Transaction**) const () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#35 0x00007f0603095986 in cling::Interpreter::process(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::Value*, cling::Transaction**, bool) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#36 0x00007f06031781a7 in cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#37 0x00007f0602e677f7 in HandleInterpreterException (metaProcessor=0x308b020, input_line=0x4194ba0 "#line 1 \"ROOT_prompt_0\"\n#include <LoKi/ParticleCuts.h>", compRes=
0x7ffd45beeafc: cling::Interpreter::kSuccess, result=0x7ffd45beeb00) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/metacling/src/TCling.cxx:2436
===========================================================
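For readers unfamiliar with the atexit machinery in frame #22: the leading arguments of local_cxa_atexit, (void (*)(void*), void*), have the same shape as those of the Itanium C++ ABI function __cxa_atexit, which compiled code calls after constructing an object with static storage duration so that its destructor runs at exit, in reverse order of registration. The sketch below is a simplified stand-in for that bookkeeping, not cling's actual implementation; ExitRegistry, Traced, destroyTraced and simulateStaticInit are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical registry mimicking the (function, argument) pairs that
// __cxa_atexit-style registration stores. Not cling's real data structure.
class ExitRegistry {
public:
    using Handler = void (*)(void*);

    void registerHandler(Handler fn, void* arg) { handlers_.push_back({fn, arg}); }

    // Run handlers in reverse order of registration, as exit() would.
    void runAll() {
        while (!handlers_.empty()) {
            Entry entry = handlers_.back();
            handlers_.pop_back();
            entry.fn(entry.arg);
        }
    }

    std::size_t size() const { return handlers_.size(); }

private:
    struct Entry {
        Handler fn;
        void* arg;
    };
    std::vector<Entry> handlers_;
};

// Stand-in for an object with static storage duration.
struct Traced {
    std::string name;
    std::vector<std::string>* log;
};

// Type-erased "destructor" with the void(*)(void*) signature seen in the trace.
void destroyTraced(void* p) {
    auto* t = static_cast<Traced*>(p);
    assert(t != nullptr && t->log != nullptr);
    t->log->push_back("~" + t->name);
}

// Simulate: construct two objects, register their destructors, then "exit".
std::vector<std::string> simulateStaticInit() {
    std::vector<std::string> log;
    ExitRegistry registry;
    Traced a{"TRTYPE", &log};
    Traced b{"ISDOWN", &log};
    log.push_back(a.name);
    registry.registerHandler(&destroyTraced, &a);
    log.push_back(b.name);
    registry.registerHandler(&destroyTraced, &b);
    registry.runAll();
    return log;
}
```

If a registered function pointer or argument is stale or overwritten by the time it is used, the failure surfaces far from the code that caused it, which would be consistent with (but does not prove) the memory-corruption suspicion above.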
Reproducer
With the test_env.sh included in test_env.zip on lxplus.cern.ch:
❯ hx test_env.sh
-bash: hx: command not found
marcocle in 🌐 lxplus913 in ~/tmp/root-issue
❯ vim test_env.sh
marcocle in 🌐 lxplus913 in ~/tmp/root-issue took 2m35s
❯ bash
marcocle in 🌐 lxplus913 in ~/tmp/root-issue
❯ . test_env.sh
marcocle in 🌐 lxplus913 in ~/tmp/root-issue
❯ root
------------------------------------------------------------------
| Welcome to ROOT 6.30/04 https://root.cern |
| (c) 1995-2024, The ROOT Team; conception: R. Brun, F. Rademakers |
| Built for linuxx8664gcc on Feb 03 2024, 17:20:15 |
| From heads/master@tags/v6-30-04 |
| With g++ (GCC) 13.1.0 |
| Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q' |
------------------------------------------------------------------
root [0] #include <LoKi/ParticleCuts.h>
*** Break *** segmentation violation
===========================================================
There was a crash (kSigSegmentationViolation).
This is the entire stack trace of all threads:
===========================================================
#0 0x00007f060a0d89fa in wait4 () from /lib64/libc.so.6
#1 0x00007f060a04b243 in do_system () from /lib64/libc.so.6
#2 0x00007f060ac59eb2 in TUnixSystem::Exec (this=0x1fbd500, shellcmd=0x9457020 "/cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/etc/gdb-backtrace.sh 212891 1>&2") at /build/jenkins/workspace/lcg_release_pip
#3 0x00007f060ac5a753 in TUnixSystem::StackTrace (this=0x1fbd500) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/unix/src/TUnixSystem.cxx:2411
#4 0x00007f060ac5e16c in TUnixSystem::DispatchSignals (this=0x1fbd500, sig=kSigSegmentationViolation) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/unix/src/TUnixSystem.cxx:3631
#5 0x00007f060ac560e0 in SigHandler (sig=kSigSegmentationViolation) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/unix/src/TUnixSystem.cxx:402
#6 0x00007f060ac5e06f in sighandler (sig=11) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/unix/src/TUnixSystem.cxx:3602
#7 0x00007f060ac47a32 in textinput::TerminalConfigUnix::HandleSignal (this=0x7f060af75d80 <textinput::TerminalConfigUnix::Get()::s>, signum=11) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.
#8 0x00007f060ac47736 in (anonymous namespace)::TerminalConfigUnix__handleSignal (signum=11) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/textinput/src/textinput/TerminalConfigUnix.
#9 <signal handler called>
#10 0x00007f05fa744a7e in ?? ()
#11 0x00007ffd45bee240 in ?? ()
#12 0x00007f060b10d028 in ?? ()
#13 0x00007ffd45bee270 in ?? ()
#14 0x00007f05fa745920 in ?? ()
#15 0x00007f060b10c1a0 in ?? ()
#16 0x00007ffd45bee260 in ?? ()
#17 0x00007ffd45bee2c0 in ?? ()
#18 0x00007f05fa745b0d in ?? ()
#19 0x000000000204fc10 in ?? ()
#20 0x00000000133b2cc8 in ?? ()
#21 0x00000000133b2cc0 in ?? ()
#22 0x00007f060309016c in (anonymous namespace)::local_cxa_atexit(void (*)(void*), void*, cling::Interpreter*) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#23 0x00007ffd45bee260 in ?? ()
#24 0x00007f060b10d110 in ?? ()
#25 0x00007f060b10d020 in ?? ()
#26 0x00007f0607fe2e7a in ?? () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#27 0x00007f05fa741095 in ?? ()
#28 0x00007f05fa740f20 in ?? ()
#29 0x00007f060471d882 in (anonymous namespace)::GenericLLVMIRPlatformSupport::initialize(llvm::orc::JITDylib&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#30 0x00007f06031155f3 in cling::IncrementalExecutor::runStaticInitializersOnce(cling::Transaction&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#31 0x00007f0603092698 in cling::Interpreter::executeTransaction(cling::Transaction&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#32 0x00007f0603125b4a in cling::IncrementalParser::commitTransaction(llvm::PointerIntPair<cling::Transaction*, 2u, cling::IncrementalParser::EParseResult, llvm::PointerLikeTypeTraits<cling::Transaction*>, llvm::PointerIntPairInfo<cling
#33 0x00007f0603128d98 in cling::IncrementalParser::Compile(llvm::StringRef, cling::CompilationOptions const&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#34 0x00007f06030933dc in cling::Interpreter::DeclareInternal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::CompilationOptions const&, cling::Transaction**) const () from /cvmfs/lhcb.cern
#35 0x00007f0603095986 in cling::Interpreter::process(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::Value*, cling::Transaction**, bool) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6
#36 0x00007f06031781a7 in cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#37 0x00007f0602e677f7 in HandleInterpreterException (metaProcessor=0x308b020, input_line=0x4194ba0 "#line 1 \"ROOT_prompt_0\"\n#include <LoKi/ParticleCuts.h>", compRes=
0x7ffd45beeafc: cling::Interpreter::kSuccess, result=0x7ffd45beeb00) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/metacling/src/TCling.cxx:2436
#38 0x00007f0602e683c4 in TCling::ProcessLine (this=0x20461a0, line=0x412ae70 "#line 1 \"ROOT_prompt_0\"\n#include <LoKi/ParticleCuts.h>", error=0x7ffd45beeedc) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.0
#39 0x00007f060aab78bf in TApplication::ProcessLine (this=0x200fe60, line=0x412ae70 "#line 1 \"ROOT_prompt_0\"\n#include <LoKi/ParticleCuts.h>", sync=false, err=0x7ffd45beeedc) at /build/jenkins/workspace/lcg_release_pipeline/build/proj
#40 0x00007f060b14a763 in TRint::ProcessLineNr (this=0x200fe60, filestem=0x7f060b15c757 "ROOT_prompt_", line=0x419af40 "#include <LoKi/ParticleCuts.h>", error=0x7ffd45beeedc) at /build/jenkins/workspace/lcg_release_pipeline/build/projec
#41 0x00007f060b149fa1 in TRint::HandleTermInput (this=0x200fe60) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/rint/src/TRint.cxx:648
#42 0x00007f060b1477cd in TTermInputHandler::Notify (this=0x413b570) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/rint/src/TRint.cxx:133
#43 0x00007f060b14c187 in TTermInputHandler::ReadNotify (this=0x413b570) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/rint/src/TRint.cxx:125
#44 0x00007f060ac58367 in TUnixSystem::CheckDescriptors (this=0x1fbd500) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/unix/src/TUnixSystem.cxx:1322
#45 0x00007f060ac577bc in TUnixSystem::DispatchOneEvent (this=0x1fbd500, pendingOnly=false) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/unix/src/TUnixSystem.cxx:1077
#46 0x00007f060ab4290f in TSystem::InnerLoop (this=0x1fbd500) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/base/src/TSystem.cxx:390
#47 0x00007f060ab426a4 in TSystem::Run (this=0x1fbd500) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/base/src/TSystem.cxx:340
#48 0x00007f060aab8367 in TApplication::Run (this=0x200fe60, retrn=false) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/base/src/TApplication.cxx:1890
#49 0x00007f060b1492e2 in TRint::Run (this=0x200fe60, retrn=false) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/rint/src/TRint.cxx:501
#50 0x0000000000401447 in main (argc=1, argv=0x7ffd45bf1438) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/main/src/rmain.cxx:84
===========================================================
The lines below might hint at the cause of the crash. If you see question
marks as part of the stack trace, try to recompile with debugging information
enabled and export CLING_DEBUG=1 environment variable before running.
You may get help by asking at the ROOT forum https://root.cern/forum
preferably using the command (.forum bug) in the ROOT prompt.
Only if you are really convinced it is a bug in ROOT then please submit a
report at https://root.cern/bugs or (preferably) using the command (.gh bug) in
the ROOT prompt. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#10 0x00007f05fa744a7e in ?? ()
#11 0x00007ffd45bee240 in ?? ()
#12 0x00007f060b10d028 in ?? ()
#13 0x00007ffd45bee270 in ?? ()
#14 0x00007f05fa745920 in ?? ()
#15 0x00007f060b10c1a0 in ?? ()
#16 0x00007ffd45bee260 in ?? ()
#17 0x00007ffd45bee2c0 in ?? ()
#18 0x00007f05fa745b0d in ?? ()
#19 0x000000000204fc10 in ?? ()
#20 0x00000000133b2cc8 in ?? ()
#21 0x00000000133b2cc0 in ?? ()
#22 0x00007f060309016c in (anonymous namespace)::local_cxa_atexit(void (*)(void*), void*, cling::Interpreter*) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#23 0x00007ffd45bee260 in ?? ()
#24 0x00007f060b10d110 in ?? ()
#25 0x00007f060b10d020 in ?? ()
#26 0x00007f0607fe2e7a in ?? () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#27 0x00007f05fa741095 in ?? ()
#28 0x00007f05fa740f20 in ?? ()
#29 0x00007f060471d882 in (anonymous namespace)::GenericLLVMIRPlatformSupport::initialize(llvm::orc::JITDylib&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#30 0x00007f06031155f3 in cling::IncrementalExecutor::runStaticInitializersOnce(cling::Transaction&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#31 0x00007f0603092698 in cling::Interpreter::executeTransaction(cling::Transaction&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#32 0x00007f0603125b4a in cling::IncrementalParser::commitTransaction(llvm::PointerIntPair<cling::Transaction*, 2u, cling::IncrementalParser::EParseResult, llvm::PointerLikeTypeTraits<cling::Transaction*>, llvm::PointerIntPairInfo<cling::Transaction*, 2u, llvm::PointerLikeTypeTraits<cling::Transaction*> > >&, bool) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#33 0x00007f0603128d98 in cling::IncrementalParser::Compile(llvm::StringRef, cling::CompilationOptions const&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#34 0x00007f06030933dc in cling::Interpreter::DeclareInternal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::CompilationOptions const&, cling::Transaction**) const () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#35 0x00007f0603095986 in cling::Interpreter::process(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::Value*, cling::Transaction**, bool) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#36 0x00007f06031781a7 in cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#37 0x00007f0602e677f7 in HandleInterpreterException (metaProcessor=0x308b020, input_line=0x4194ba0 "#line 1 \"ROOT_prompt_0\"\n#include <LoKi/ParticleCuts.h>", compRes=
0x7ffd45beeafc: cling::Interpreter::kSuccess, result=0x7ffd45beeb00) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/metacling/src/TCling.cxx:2436
===========================================================
Root > .q
ROOT version
❯ which root
/cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/bin/root
------------------------------------------------------------------
| Welcome to ROOT 6.30/04 https://root.cern |
| (c) 1995-2024, The ROOT Team; conception: R. Brun, F. Rademakers |
| Built for linuxx8664gcc on Feb 03 2024, 17:20:15 |
| From heads/master@tags/v6-30-04 |
| With g++ (GCC) 13.1.0 |
| Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q' |
------------------------------------------------------------------
Installation method
LCG builds
Operating system
Linux (EL9)
Additional context
No response
A stripped down version of the header that segfaults:
// test.h
#include "LoKi/Particles.h"
using EQUALTO = LoKi::EqualToValue<const LHCb::Particle*>;
const auto TRTYPE = LoKi::Particles::TrackType{};
// Remove ANY of the lines below and the segfault disappears.
const auto ISDOWN = EQUALTO{TRTYPE, LHCb::Track::Types::Downstream};
const auto ISLONG = EQUALTO{TRTYPE, LHCb::Track::Types::Long};
const auto MUONBDT_CATBOOST = LoKi::Particles::MuonMVA2{};
const auto ISMUONPID = LoKi::Particles::IsMuon{};
const auto ISMUONLOOSE = LoKi::Particles::IsMuonLoose{};
const auto ISMUONTIGHT = LoKi::Particles::IsMuonTight{};
const auto ISUP = EQUALTO{TRTYPE, LHCb::Track::Types::Upstream};
const auto KEY = LoKi::Particles::Key{};
const auto M = LoKi::Particles::Mass{};
const auto LV01 = LoKi::Particles::DecayAngle{1};
const auto LV02 = LoKi::Particles::DecayAngle{2};
const auto LV03 = LoKi::Particles::DecayAngle{3};
const auto LV04 = LoKi::Particles::DecayAngle{4};
const auto M0 = LoKi::Particles::Mass{};
const auto M1 = LoKi::Particles::InvariantMass{1};
const auto M12 = LoKi::Particles::InvariantMass{1, 2};
const auto M13 = LoKi::Particles::InvariantMass{1, 3};
const auto M14 = LoKi::Particles::InvariantMass{1, 4};
const auto M2 = LoKi::Particles::InvariantMass{2};
const auto M23 = LoKi::Particles::InvariantMass{2, 3};
const auto M24 = LoKi::Particles::InvariantMass{2, 4};
const auto M34 = LoKi::Particles::InvariantMass{3, 4};
const auto MM = LoKi::Particles::MeasuredMass{};
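Each definition in the header above is a namespace-scope object, so merely including the header makes the interpreter run one dynamic initializer per line (the runStaticInitializersOnce frame in the trace). The sketch below illustrates this with a stand-in type: demo::InvariantMass and demo::constructed are hypothetical and unrelated to the real LoKi classes, whose constructors are assumed here, not verified, to be non-trivial.

```cpp
#include <cstddef>

namespace demo {

// Counter of constructor calls; a function-local static avoids any
// initialization-order dependency between the counter and the objects.
inline std::size_t& constructed() {
    static std::size_t n = 0;
    return n;
}

// Hypothetical stand-in for a functor type such as those in test.h.
struct InvariantMass {
    int i;
    int j;
    explicit InvariantMass(int a, int b = 0) : i(a), j(b) { ++constructed(); }
    ~InvariantMass() {}  // user-provided, so destruction is non-trivial
};

// Same pattern as test.h: const objects at namespace scope.
// Each one needs a constructor call before main() (or, in an interpreter,
// when the transaction containing the #include is executed).
const auto M1 = InvariantMass{1};
const auto M12 = InvariantMass{1, 2};
const auto M13 = InvariantMass{1, 3};

}  // namespace demo
```

By the time main() starts, demo::constructed() reports three constructor calls, one per definition. Adding or removing a line in test.h therefore changes how many initializers and exit-time destructors are involved, which may be relevant to why the crash depends on the exact set of lines.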
I ran into a similar situation. I am using LCG_103.
*** Break *** segmentation violation
===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0 0x00007f62c74d89fa in wait4 () from /lib64/libc.so.6
#1 0x00007f62c744b243 in do_system () from /lib64/libc.so.6
#2 0x00007f62c570fb69 in TUnixSystem::StackTrace() () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-opt/lib/libCore.so
#3 0x00007f62c5edf463 in (anonymous namespace)::TExceptionHandlerImp::HandleException(int) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-opt/lib/libcppyy_backend3_9.so
#4 0x00007f62c570f391 in TUnixSystem::DispatchSignals(ESignals) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-opt/lib/libCore.so
#5 <signal handler called>
#6 0x00007f62be7d56fe in ?? ()
#7 0x00007ffce112d920 in ?? ()
#8 0x00007f62be7dc429 in ?? ()
#9 0x00007ffce112d950 in ?? ()
#10 0x00007f62be7d26b0 in ?? ()
#11 0x00007f62bacd3180 in ?? ()
#12 0x00007ffce112d940 in ?? ()
#13 0x00007ffce112d9a0 in ?? ()
#14 0x00007f62be7d985d in ?? ()
#15 0x000000000214fc80 in ?? ()
#16 0x00007f62be7d26b0 in ?? ()
#17 0x000000001d59e690 in ?? ()
#18 0x00007f62bf3940ec in (anonymous namespace)::local_cxa_atexit(void (*)(void*), void*, cling::Interpreter*) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-opt/lib/libCling.so
#19 0x00007ffce112d940 in ?? ()
#20 0x00007f62bacd5778 in ?? ()
#21 0x00007f62bacd5740 in ?? ()
#22 0x0000000016991e30 in ?? ()
#23 0x00007f62be7d298d in ?? ()
#24 0x0000000000000000 in ?? ()
===========================================================
The lines below might hint at the cause of the crash. If you see question
marks as part of the stack trace, try to recompile with debugging information
enabled and export CLING_DEBUG=1 environment variable before running.
You may get help by asking at the ROOT forum https://root.cern/forum
preferably using the command (.forum bug) in the ROOT prompt.
Only if you are really convinced it is a bug in ROOT then please submit a
report at https://root.cern/bugs or (preferably) using the command (.gh bug) in
the ROOT prompt. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#6 0x00007f62be7d56fe in ?? ()
#7 0x00007ffce112d920 in ?? ()
#8 0x00007f62be7dc429 in ?? ()
#9 0x00007ffce112d950 in ?? ()
#10 0x00007f62be7d26b0 in ?? ()
#11 0x00007f62bacd3180 in ?? ()
#12 0x00007ffce112d940 in ?? ()
#13 0x00007ffce112d9a0 in ?? ()
#14 0x00007f62be7d985d in ?? ()
#15 0x000000000214fc80 in ?? ()
#16 0x00007f62be7d26b0 in ?? ()
#17 0x000000001d59e690 in ?? ()
#18 0x00007f62bf3940ec in (anonymous namespace)::local_cxa_atexit(void (*)(void*), void*, cling::Interpreter*) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-opt/lib/libCling.so
#19 0x00007ffce112d940 in ?? ()
#20 0x00007f62bacd5778 in ?? ()
#21 0x00007f62bacd5740 in ?? ()
#22 0x0000000016991e30 in ?? ()
#23 0x00007f62be7d298d in ?? ()
#24 0x0000000000000000 in ?? ()
===========================================================
Can you run valgrind using the root suppression file?
Hi, here is the result. Do note that I do not know valgrind very well.
I ran this:
valgrind -v --leak-check=full --show-leak-kinds=all --suppressions=/afs/cern.ch/work/r/ruide/valgrind-root.supp ls -l ./run gaudirun.py /afs/cern.ch/work/r/ruide/private/starterkit/ntuple_options.py
The result looks like this:
==930722== HEAP SUMMARY:
==930722== in use at exit: 20,459 bytes in 11 blocks
==930722== total heap usage: 754 allocs, 743 frees, 93,819 bytes allocated
==930722==
==930722== Searching for pointers to 11 not-freed blocks
==930722== Checked 132,624 bytes
==930722==
==930722== 24 bytes in 1 blocks are still reachable in loss record 1 of 9
==930722== at 0x484480F: malloc (vg_replace_malloc.c:442)
==930722== by 0x1164DF: ??? (in /usr/bin/ls)
==930722== by 0x11657C: ??? (in /usr/bin/ls)
==930722== by 0x1178C7: ??? (in /usr/bin/ls)
==930722== by 0x10D35A: ??? (in /usr/bin/ls)
==930722== by 0x48D958F: (below main) (in /usr/lib64/libc.so.6)
==930722==
==930722== 24 bytes in 1 blocks are still reachable in loss record 2 of 9
==930722== at 0x484480F: malloc (vg_replace_malloc.c:442)
==930722== by 0x11661F: ??? (in /usr/bin/ls)
==930722== by 0x117A3E: ??? (in /usr/bin/ls)
==930722== by 0x10D35A: ??? (in /usr/bin/ls)
==930722== by 0x48D958F: (below main) (in /usr/lib64/libc.so.6)
==930722==
==930722== 48 bytes in 1 blocks are still reachable in loss record 3 of 9
==930722== at 0x484480F: malloc (vg_replace_malloc.c:442)
==930722== by 0x115ECB: ??? (in /usr/bin/ls)
==930722== by 0x10E074: ??? (in /usr/bin/ls)
==930722== by 0x48D958F: (below main) (in /usr/lib64/libc.so.6)
==930722==
==930722== 54 bytes in 2 blocks are still reachable in loss record 4 of 9
==930722== at 0x484480F: malloc (vg_replace_malloc.c:442)
==930722== by 0x494C12E: strdup (in /usr/lib64/libc.so.6)
==930722== by 0x48929FF: selinux_raw_to_trans_context (in /usr/lib64/libselinux.so.1)
==930722== by 0x4892ADB: lgetfilecon (in /usr/lib64/libselinux.so.1)
==930722== by 0x117464: ??? (in /usr/bin/ls)
==930722== by 0x10D35A: ??? (in /usr/bin/ls)
==930722== by 0x48D958F: (below main) (in /usr/lib64/libc.so.6)
==930722==
==930722== 56 bytes in 1 blocks are still reachable in loss record 5 of 9
==930722== at 0x484480F: malloc (vg_replace_malloc.c:442)
==930722== by 0x115DC8: ??? (in /usr/bin/ls)
==930722== by 0x115DF9: ??? (in /usr/bin/ls)
==930722== by 0x10E549: ??? (in /usr/bin/ls)
==930722== by 0x48D958F: (below main) (in /usr/lib64/libc.so.6)
==930722==
==930722== 56 bytes in 1 blocks are still reachable in loss record 6 of 9
==930722== at 0x484480F: malloc (vg_replace_malloc.c:442)
==930722== by 0x115DC8: ??? (in /usr/bin/ls)
==930722== by 0x115DF9: ??? (in /usr/bin/ls)
==930722== by 0x10D1A2: ??? (in /usr/bin/ls)
==930722== by 0x48D958F: (below main) (in /usr/lib64/libc.so.6)
==930722==
==930722== 69 bytes in 2 blocks are still reachable in loss record 7 of 9
==930722== at 0x484480F: malloc (vg_replace_malloc.c:442)
==930722== by 0x11666E: ??? (in /usr/bin/ls)
==930722== by 0x116CAF: ??? (in /usr/bin/ls)
==930722== by 0x10D35A: ??? (in /usr/bin/ls)
==930722== by 0x48D958F: (below main) (in /usr/lib64/libc.so.6)
==930722==
==930722== 128 bytes in 1 blocks are still reachable in loss record 8 of 9
==930722== at 0x484480F: malloc (vg_replace_malloc.c:442)
==930722== by 0x113989: ??? (in /usr/bin/ls)
==930722== by 0x10D2A6: ??? (in /usr/bin/ls)
==930722== by 0x48D958F: (below main) (in /usr/lib64/libc.so.6)
==930722==
==930722== 20,000 bytes in 1 blocks are still reachable in loss record 9 of 9
==930722== at 0x484480F: malloc (vg_replace_malloc.c:442)
==930722== by 0x115DC8: ??? (in /usr/bin/ls)
==930722== by 0x10D310: ??? (in /usr/bin/ls)
==930722== by 0x48D958F: (below main) (in /usr/lib64/libc.so.6)
==930722==
==930722== LEAK SUMMARY:
==930722== definitely lost: 0 bytes in 0 blocks
==930722== indirectly lost: 0 bytes in 0 blocks
==930722== possibly lost: 0 bytes in 0 blocks
==930722== still reachable: 20,459 bytes in 11 blocks
==930722== suppressed: 0 bytes in 0 blocks
==930722==
==930722== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I am surprised that valgrind is happy…
There is a problem in the way the application was invoked: a stray ls -l on the command line makes valgrind check ls and not gaudirun.py.
I tried to run Valgrind, but it only spots one small leak, despite the segfault.
But that made me look a bit better at the stack trace and I noticed the line
#37 0x00007f0602e677f7 in HandleInterpreterException (metaProcessor=0x308b020, input_line=0x4194ba0 "#line 1 \"ROOT_prompt_0\"\n#include <LoKi/ParticleCuts.h>", compRes=
0x7ffd45beeafc: cling::Interpreter::kSuccess, result=0x7ffd45beeb00) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/metacling/src/TCling.cxx:2436
I also tried putting the line #include <LoKi/ParticleCuts.h> into a small file test.C and invoking root test.C: no segfault, but an error from cling complaining about redefinition of symbols.
Tomorrow I'll investigate this new path, as it might be that the segfault is a red herring (hiding the actual problem in my code).
The path /cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1519 no longer exists (it was removed). I tried to reproduce the error with 1529 by loading the stripped-down header file instead (root test.h), and I now get a different error message:
Processing temp.h...
In file included from input_line_8:1:
In file included from /afs/cern.ch/user/d/dvalapar/temp.h:2:
In file included from /cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/Phys/InstallArea/x86_64_v2-el9-gcc13-dbg/include/LoKi/Particles.h:20:
/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/LHCb/InstallArea/x86_64_v2-el9-gcc13-dbg/include/Event/ProtoParticle.h:35:21: error: redefinition of 'CLID_ProtoParticle'
static const CLID CLID_ProtoParticle = 803;
^
input_line_10:1:10: note: '/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/LHCb/InstallArea/x86_64_v2-el9-gcc13-dbg/include/Event/ProtoParticle.h' included multiple times, additional include site here
#include "/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/LHCb/InstallArea/x86_64_v2-el9-gcc13-dbg/include/Event/ProtoParticle.h"
^
/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/Phys/InstallArea/x86_64_v2-el9-gcc13-dbg/include/LoKi/Particles.h:20:10: note: '/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/LHCb/InstallArea/x86_64_v2-el9-gcc13-dbg/include/Event/ProtoParticle.h' included multiple times, additional include site here
#include "Event/ProtoParticle.h"
^
...
...SKIPPED LINES
...
/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/LHCb/InstallArea/x86_64_v2-el9-gcc13-dbg/include/Event/ProtoParticle.h:55:9: error: redefinition of 'ProtoParticle'
class ProtoParticle final : public KeyedObject<int> {
^
input_line_10:1:10: note: '/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/LHCb/InstallArea/x86_64_v2-el9-gcc13-dbg/include/Event/ProtoParticle.h' included multiple times, additional include site here
#include "/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/LHCb/InstallArea/x86_64_v2-el9-gcc13-dbg/include/Event/ProtoParticle.h"
^
/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/Phys/InstallArea/x86_64_v2-el9-gcc13-dbg/include/LoKi/Particles.h:20:10: note: '/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/LHCb/InstallArea/x86_64_v2-el9-gcc13-dbg/include/Event/ProtoParticle.h' included multiple times, additional include site here
#include "Event/ProtoParticle.h"
^
In file included from input_line_8:1:
In file included from /afs/cern.ch/user/d/dvalapar/temp.h:2:
In file included from /cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/Phys/InstallArea/x86_64_v2-el9-gcc13-dbg/include/LoKi/Particles.h:20:
/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/LHCb/InstallArea/x86_64_v2-el9-gcc13-dbg/include/Event/ProtoParticle.h:326:24: error: redefinition of 'operator<<'
inline std::ostream& operator<<( std::ostream& s, LHCb::ProtoParticle::additionalInfo e ) {
^
/cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/LHCb/InstallArea/x86_64_v2-el9-gcc13-dbg/include/Event/ProtoParticle.h:326:24: note: previous definition is here
inline std::ostream& operator<<( std::ostream& s, LHCb::ProtoParticle::additionalInfo e ) {
^
root.exe: /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/metacling/src/TCling.cxx:2200: virtual void TCling::RegisterModule(const char*, const char**, const char**, const char*, const char*, void (*)(), const TInterpreter::FwdDeclArgsToKeepCollection_t&, const char**, Bool_t, Bool_t): Assertion `cling::Interpreter::kSuccess == compRes && "The forward declarations could not be compiled"' failed.
The error seems weird because I see #pragma once in ProtoParticle.h.
Yes, I noticed that as well. I switched to 1533 and got the same segfault.
Thank you for the note. I will try it again later!
There's a problem in the way the application was invoked: there is a stray ls -l on the command line that makes valgrind check ls and not gaudirun.py.
I have a small update, but no good news.
When trying to reproduce the segfault with a root test.C I get stuck in problems that seem related to bad handling of #pragma once and include guards. If I solve the include guards problems then I still get the segfault both via the interactive #include <LoKi/ParticleCuts.h> and root test.C.
I prepared a small "reproducer" that should work on any RHEL9-equivalent machine with CVMFS and the HEP_OSlibs meta-rpm. See the attached root-15511.tar.gz
Output of valgrind with the original report of just #include <LoKi/ParticleCuts.h>, replacing 1519 in the paths with 1529:
$ VALGRIND_LIB=/cvmfs/lhcb.cern.ch/lib/lcg/releases/valgrind/3.22.0-113bc/x86_64-el9-gcc13-dbg/libexec/valgrind/ valgrind --leak-check=full --suppressions=/cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/etc/valgrind-root.supp root.exe -q -e "#include <LoKi/ParticleCuts.h>"
==652727== Memcheck, a memory error detector
==652727== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==652727== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==652727== Command: root.exe -q -e #include\ \<LoKi/ParticleCuts.h\>
==652727==
------------------------------------------------------------------
| Welcome to ROOT 6.30/04 https://root.cern |
| (c) 1995-2024, The ROOT Team; conception: R. Brun, F. Rademakers |
| Built for linuxx8664gcc on Feb 03 2024, 17:20:15 |
| From heads/master@tags/v6-30-04 |
| With g++ (GCC) 13.1.0 |
| Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q' |
------------------------------------------------------------------
==652727== Conditional jump or move depends on uninitialised value(s)
==652727== at 0xB2BAFE3: llvm::ConstantExpr::getGetElementPtr(llvm::Type*, llvm::Constant*, llvm::ArrayRef<llvm::Value*>, bool, llvm::Optional<unsigned int>, llvm::Type*) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0xA9E4679: llvm::Evaluator::EvaluateBlock(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, false, false, void>, false, false>, llvm::BasicBlock*&, bool&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0xA9E5D5C: llvm::Evaluator::EvaluateFunction(llvm::Function*, llvm::Constant*&, llvm::SmallVectorImpl<llvm::Constant*> const&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0xA9E4F46: llvm::Evaluator::EvaluateBlock(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, false, false, void>, false, false>, llvm::BasicBlock*&, bool&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0xA9E5D5C: llvm::Evaluator::EvaluateFunction(llvm::Function*, llvm::Constant*&, llvm::SmallVectorImpl<llvm::Constant*> const&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x94CD322: EvaluateStaticConstructor(llvm::Function*, llvm::DataLayout const&, llvm::TargetLibraryInfo*) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0xA9D2A88: llvm::optimizeGlobalCtorsList(llvm::Module&, llvm::function_ref<bool (llvm::Function*)>) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x94D4887: (anonymous namespace)::GlobalOptLegacyPass::runOnModule(llvm::Module&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0xB392769: llvm::legacy::PassManagerImpl::run(llvm::Module&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x77E6523: cling::IncrementalExecutor::runStaticInitializersOnce(cling::Transaction&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x7763697: cling::Interpreter::executeTransaction(cling::Transaction&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x77F6B49: cling::IncrementalParser::commitTransaction(llvm::PointerIntPair<cling::Transaction*, 2u, cling::IncrementalParser::EParseResult, llvm::PointerLikeTypeTraits<cling::Transaction*>, llvm::PointerIntPairInfo<cling::Transaction*, 2u, llvm::PointerLikeTypeTraits<cling::Transaction*> > >&, bool) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727==
==652727== Invalid read of size 8
==652727== at 0x40703A7E: ???
==652727== by 0x40704B0C: ???
==652727== by 0x40700094: ???
==652727== by 0x406FFF1F: ???
==652727== by 0x8DEE881: (anonymous namespace)::GenericLLVMIRPlatformSupport::initialize(llvm::orc::JITDylib&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x77E65F2: cling::IncrementalExecutor::runStaticInitializersOnce(cling::Transaction&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x7763697: cling::Interpreter::executeTransaction(cling::Transaction&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x77F6B49: cling::IncrementalParser::commitTransaction(llvm::PointerIntPair<cling::Transaction*, 2u, cling::IncrementalParser::EParseResult, llvm::PointerLikeTypeTraits<cling::Transaction*>, llvm::PointerIntPairInfo<cling::Transaction*, 2u, llvm::PointerLikeTypeTraits<cling::Transaction*> > >&, bool) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x77F9D97: cling::IncrementalParser::Compile(llvm::StringRef, cling::CompilationOptions const&) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x77643DB: cling::Interpreter::DeclareInternal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::CompilationOptions const&, cling::Transaction**) const (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x7766985: cling::Interpreter::process(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::Value*, cling::Transaction**, bool) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== by 0x78491A6: cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) (in /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so)
==652727== Address 0xffffffffffffffe8 is not stack'd, malloc'd or (recently) free'd
==652727==
*** Break *** segmentation violation
(note that I ran it directly on root.exe, otherwise valgrind will only see the "wrapper" root executable that forks into root.exe)
With CLING_DEBUG=1, we can at least get a proper stack trace of where it's crashing:
$ CLING_DEBUG=1 root.exe
------------------------------------------------------------------
| Welcome to ROOT 6.30/04 https://root.cern |
| (c) 1995-2024, The ROOT Team; conception: R. Brun, F. Rademakers |
| Built for linuxx8664gcc on Feb 03 2024, 17:20:15 |
| From heads/master@tags/v6-30-04 |
| With g++ (GCC) 13.1.0 |
| Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q' |
------------------------------------------------------------------
root [0] #include <LoKi/ParticleCuts.h>
*** Break *** segmentation violation
===========================================================
There was a crash (kSigSegmentationViolation).
This is the entire stack trace of all threads:
===========================================================
#0 0x00007ff1c12d89fa in wait4 () from /lib64/libc.so.6
#1 0x00007ff1c124b243 in do_system () from /lib64/libc.so.6
#2 0x00007ff1c1e59eb2 in TUnixSystem::Exec (this=0xf5b500, shellcmd=0x880a600 "/cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/etc/gdb-backtrace.sh 576550 1>&2") at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/unix/src/TUnixSystem.cxx:2120
#3 0x00007ff1c1e5a753 in TUnixSystem::StackTrace (this=0xf5b500) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/unix/src/TUnixSystem.cxx:2411
#4 0x00007ff1c1e5e16c in TUnixSystem::DispatchSignals (this=0xf5b500, sig=kSigSegmentationViolation) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/unix/src/TUnixSystem.cxx:3631
#5 0x00007ff1c1e560e0 in SigHandler (sig=kSigSegmentationViolation) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/unix/src/TUnixSystem.cxx:402
#6 0x00007ff1c1e5e06f in sighandler (sig=11) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/unix/src/TUnixSystem.cxx:3602
#7 0x00007ff1c1e47a32 in textinput::TerminalConfigUnix::HandleSignal (this=0x7ff1c2175d80 <textinput::TerminalConfigUnix::Get()::s>, signum=11) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/textinput/src/textinput/TerminalConfigUnix.cpp:99
#8 0x00007ff1c1e47736 in (anonymous namespace)::TerminalConfigUnix__handleSignal (signum=11) at /build/jenkins/workspace/lcg_release_pipeline/build/projects/ROOT-6.30.04/src/ROOT/6.30.04/core/textinput/src/textinput/TerminalConfigUnix.cpp:36
#9 <signal handler called>
#10 0x00007ff1b033fa7e in LoKi::FunctorFromFunctor<LHCb::Particle const*, double>::FunctorFromFunctor (this=0x7ffcf3a95340, right=...) at /cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/LHCb/InstallArea/x86_64_v2-el9-gcc13-dbg/include/LoKi/Functor.h:114
#11 0x00007ff1b0340b0d in __cxx_global_var_initcling_module_10_.161(void) () at /cvmfs/lhcbdev.cern.ch/nightlies/lhcb-run2-patches/1529/Phys/InstallArea/x86_64_v2-el9-gcc13-dbg/include/LoKi/ParticleCuts.h:2880
#12 0x00007ff1b033c095 in __orc_init_func.cling-module-10 ()
#13 0x00007ff1bb91d882 in (anonymous namespace)::GenericLLVMIRPlatformSupport::initialize(llvm::orc::JITDylib&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
#14 0x00007ff1ba3155f3 in cling::IncrementalExecutor::runStaticInitializersOnce(cling::Transaction&) () from /cvmfs/lhcb.cern.ch/lib/lcg/releases/ROOT/6.30.04-dd2db/x86_64-el9-gcc13-dbg/lib/libCling.so
@pikacic if LoKi::FunctorFromFunctor<LHCb::Particle const*, double>::FunctorFromFunctor rings a bell for you, please shout. That's what I will look into next...
Let me have a look.
Okay, never mind, this is a Cling issue: If there are more than 16 const variables with non-trivial constructors, their execution order may be scrambled:
extern "C" int printf(const char*, ...);
struct A {
  int val;
  A(int v) : val(v) {
    printf("A(%d), this = %p\n", val, this);
  }
  ~A() {
    printf("~A(%d), this = %p\n", val, this);
  }
};
const A a1(1);
const A a2(2);
const A a3(3);
const A a4(4);
const A a5(5);
const A a6(6);
const A a7(7);
const A a8(8);
const A a9(9);
const A a10(10);
const A a11(11);
const A a12(12);
const A a13(13);
const A a14(14);
const A a15(15);
const A a16(16);
const A a17(17);
This should print from 1 to 17, but for example master gives:
A(9), this = 0x7f9f2174e088
A(17), this = 0x7f9f2174e108
A(16), this = 0x7f9f2174e0f8
A(15), this = 0x7f9f2174e0e8
A(14), this = 0x7f9f2174e0d8
A(13), this = 0x7f9f2174e0c8
A(12), this = 0x7f9f2174e0b8
A(11), this = 0x7f9f2174e0a8
A(10), this = 0x7f9f2174e098
A(1), this = 0x7f9f2174e008
A(8), this = 0x7f9f2174e078
A(7), this = 0x7f9f2174e068
A(6), this = 0x7f9f2174e058
A(5), this = 0x7f9f2174e048
A(4), this = 0x7f9f2174e038
A(3), this = 0x7f9f2174e028
A(2), this = 0x7f9f2174e018
~A(2), this = 0x7f9f2174e018
~A(3), this = 0x7f9f2174e028
~A(4), this = 0x7f9f2174e038
~A(5), this = 0x7f9f2174e048
~A(6), this = 0x7f9f2174e058
~A(7), this = 0x7f9f2174e068
~A(8), this = 0x7f9f2174e078
~A(1), this = 0x7f9f2174e008
~A(10), this = 0x7f9f2174e098
~A(11), this = 0x7f9f2174e0a8
~A(12), this = 0x7f9f2174e0b8
~A(13), this = 0x7f9f2174e0c8
~A(14), this = 0x7f9f2174e0d8
~A(15), this = 0x7f9f2174e0e8
~A(16), this = 0x7f9f2174e0f8
~A(17), this = 0x7f9f2174e108
~A(9), this = 0x7f9f2174e088
(at least destruction order is consistent)
For the LHCb headers, this causes problems because some constructor calls reference other global const objects and the scrambled order means they are not constructed yet.
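To make the failure mode concrete, here is a minimal sketch of why a scrambled initializer order is fatal when one global's constructor reads another. It is a simplified model, not the actual LoKi code: the function name firstBrokenInit and the index-based dependency table are hypothetical and only illustrate the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// dependsOn[i] is the index of the global that object i reads in its
// constructor, or -1 if it reads none. (Hypothetical model, not LoKi code.)
// order lists the indices in the sequence the initializers are run.
// Returns the index of the first object whose dependency was not yet
// constructed, or -1 if every dependency was ready in time.
int firstBrokenInit(const std::vector<int>& dependsOn,
                    const std::vector<int>& order) {
    assert(dependsOn.size() == order.size());
    std::vector<bool> constructed(dependsOn.size(), false);
    for (int idx : order) {
        const int dep = dependsOn[static_cast<std::size_t>(idx)];
        if (dep >= 0 && !constructed[static_cast<std::size_t>(dep)])
            return idx;  // this constructor would read an unconstructed object
        constructed[static_cast<std::size_t>(idx)] = true;
    }
    return -1;
}
```

With a dependency table {-1, 0, 1} (object 1 reads object 0, object 2 reads object 1), running the initializers in declaration order 0, 1, 2 returns -1, while a scrambled order such as 2, 1, 0 returns 2, the first object that would touch memory that has not been initialized yet.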
This seems to be caused by https://github.com/root-project/root/pull/13614, which was meant to fix https://github.com/root-project/root/issues/13429, and therefore affects v6.28/08, where it was backported, and later versions (all v6.30, v6.32, master). I'll try to understand why the order starts changing with more than 16 const variables and work on a fix next.
It turns out there should be an llvm::stable_sort instead of llvm::sort to preserve the order between constructors with the same priority. With up to 16 const variables, we are lucky - maybe because it switches to a different sorting algorithm below a threshold? I submitted an upstream LLVM fix: https://github.com/llvm/llvm-project/pull/95532 and will work on applying it to all affected ROOT versions.
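The difference between the two sorts can be sketched with the standard library alone. This is an illustration under assumptions, not the LLVM patch itself: CtorEntry, orderByPriority and makeEqualPriorityEntries are hypothetical names, the priority value is arbitrary, and std::stable_sort stands in for llvm::stable_sort (which, as far as I know, is a thin wrapper around it). The point is only that a stable sort keeps entries with equal priority in their original order, which an unstable sort does not promise.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of a global constructor list:
// a priority plus the name of the object to construct.
struct CtorEntry {
    int priority;
    std::string name;
};

// Sort by priority only. std::stable_sort guarantees that entries with
// equal priority keep their relative (declaration) order; std::sort
// gives no such guarantee.
std::vector<std::string> orderByPriority(std::vector<CtorEntry> entries) {
    std::stable_sort(entries.begin(), entries.end(),
                     [](const CtorEntry& a, const CtorEntry& b) {
                         return a.priority < b.priority;
                     });
    std::vector<std::string> names;
    for (const auto& e : entries) names.push_back(e.name);
    return names;
}

// Build n entries a1..an that all share one (arbitrary) priority,
// mirroring the 17 const objects in the reproducer.
std::vector<CtorEntry> makeEqualPriorityEntries(int n) {
    assert(n >= 0);
    std::vector<CtorEntry> entries;
    for (int i = 1; i <= n; ++i)
        entries.push_back({65535, "a" + std::to_string(i)});
    return entries;
}
```

Calling orderByPriority(makeEqualPriorityEntries(17)) returns a1 through a17 in declaration order. Replacing std::stable_sort with std::sort would still compile, but the order of the equal-priority entries would then be unspecified, so whether it happens to come out right for short inputs depends on the standard library implementation.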
Awesome! Thanks for the detailed analysis and a fix!
Thanks a lot!
As requested by LHCb, a release for the 6.30 branch including the fix was provided today: https://root-forum.cern.ch/t/root-6-30-08-is-out