FairRoot icon indicating copy to clipboard operation
FairRoot copied to clipboard

FileSink is not closed if FairRunOnline::Run() is skipped and never closed for FairRunAna

Open YanzhaoW opened this issue 7 months ago • 7 comments

Describe the bug FileSink is not closed when FairRunOnline::Run() function is skipped resulting in program crushing when ROOT starts cleaning up variables. (Never closed for FairRunAna)

To Reproduce Use any examples with FileSink and comment out the FairRun::Run() calling.

Causes

For FairRunOnline:

https://github.com/FairRootGroup/FairRoot/blob/fa7108b45684c729c2e88b12d7fbf4bbb5a4dc65/online/steer/FairRunOnline.cxx#L338-L354

For FairRunAna:

https://github.com/FairRootGroup/FairRoot/blob/fa7108b45684c729c2e88b12d7fbf4bbb5a4dc65/base/steer/FairRunAna.cxx#L599-L611

This function is never called.

Similar for FairRunSim.

Logs / Screenshots

Here is the stacktrace if anyone is interested
#0  0x00007f7527d3f2d7 in __GI___waitpid (pid=28130, stat_loc=stat_loc
entry=0x7ffc17b31a08, options=options
entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:30
#1  0x00007f7527cbd70f in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:149
#2  0x00007f75296f6913 in TUnixSystem::Exec (shellcmd=<optimized out>, this=0x55ae1e3537c0) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/unix/src/TUnixSystem.cxx:2104
#3  TUnixSystem::StackTrace (this=0x55ae1e3537c0) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/unix/src/TUnixSystem.cxx:2395
#4  0x00007f75296f8c55 in TUnixSystem::DispatchSignals (this=0x55ae1e3537c0, sig=kSigSegmentationViolation) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/unix/src/TUnixSystem.cxx:3615
#5  <signal handler called>
#6  0x00007f7527cfc0af in _int_malloc (av=av
entry=0x7f7527e33c40 <main_arena>, bytes=bytes
entry=64) at malloc.c:3678
#7  0x00007f7527cfd5a3 in __GI___libc_malloc (bytes=64) at malloc.c:3060
#8  0x00007f752806afd8 in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007f7522864ab9 in std::_Deque_base<ClingMemberIterInternal::UsingDeclIter::UsingDeclFrame, std::allocator<ClingMemberIterInternal::UsingDeclIter::UsingDeclFrame> >::_M_initialize_map(unsigned long) () from /cvmfs/fairsoft.gsi.de/debian10/fairsoft/nov22p1/lib/libCling.so
#10 0x00007f75228114f9 in TCling::DataMemberInfo_FactoryCopy(DataMemberInfo_t*) const () from /cvmfs/fairsoft.gsi.de/debian10/fairsoft/nov22p1/lib/libCling.so
#11 0x00007f75296bceb8 in TListOfDataMembers::Get (this=0x55ae1e459030, info=0x55ae20c91a80, skipChecks=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/inc/TClass.h:433
#12 0x00007f75296bd17a in TListOfDataMembers::Load (this=0x55ae1e459030) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/src/TListOfDataMembers.cxx:474
#13 0x00007f75296964b8 in TClass::CreateListOfDataMembers (this=0x55ae20923ab0, data=..., selection=TDictionary::EMemberSelection::kNoUsingDecls, load=<optimized out>) at /usr/include/c++/8/bits/atomic_base.h:707
#14 0x00007f75296a4bda in TClass::GetDataMember (this=0x55ae20923ab0, datamember=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/src/TClass.cxx:3442
#15 0x00007f75296a79de in TBuildRealData::Inspect (this=0x7ffc17b345e0, cl=0x55ae20923ab0, pname=0x55ae20bd7700 "", mname=0x7ffc17b344e0 "fBaseVersion", add=0xb8, isTransient=false) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/src/TClass.cxx:773
#16 0x00007f752281e812 in TCling::InspectMembers(TMemberInspector&, void const*, TClass const*, bool) () from /cvmfs/fairsoft.gsi.de/debian10/fairsoft/nov22p1/lib/libCling.so
#17 0x00007f752969883d in TClass::CallShowMembers (isTransient=false, insp=..., obj=0x0, this=0x55ae20923ab0) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/src/TClass.cxx:2223
#18 TClass::CallShowMembers (this=0x55ae20923ab0, obj=0x0, insp=..., isTransient=false) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/src/TClass.cxx:2203
#19 0x00007f75296a69b5 in TClass::BuildRealData (this=0x55ae20923ab0, pointer=pointer
entry=0x0, isTransient=false) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/src/TClass.cxx:2078
#20 0x00007f7528433248 in TStreamerInfo::Build (this=0x55ae20c90770, isTransient=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TStreamerInfo.cxx:307
#21 0x00007f75296975dc in TClass::GetStreamerInfoImpl (this=0x55ae20923ab0, version=3, silent=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/src/TClass.cxx:4655
#22 0x00007f752969775f in TClass::GetStreamerInfo (this=this
entry=0x55ae20923ab0, version=3, version
entry=0, isTransient=isTransient
entry=false) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/src/TClass.cxx:4615
#23 0x00007f752838892c in TBufferFile::WriteClassBuffer (this=this
entry=0x55ae1e45b800, cl=0x55ae20923ab0, pointer=pointer
entry=0x55ae1fc2e0a0) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TBufferFile.cxx:3522
#24 0x00007f75296d8668 in TStreamerBase::Streamer (this=0x55ae1fc2e0a0, R(bool)=...) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/src/TStreamerElement.cxx:826
#25 0x00007f7528387ce7 in TClass::Streamer (onfile_class=0x0, b=..., obj=0x55ae1fc2e0a0, this=0x55ae20923ab0) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/inc/TClass.h:609
#26 TBufferFile::WriteObjectClass (this=0x55ae1e45b800, actualObjectStart=0x55ae1fc2e0a0, actualClass=0x55ae20923ab0, cacheReuse=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TBufferFile.cxx:2553
#27 0x00007f752838e8fd in TBufferIO::WriteObjectAny (cacheReuse=<optimized out>, ptrClass=0x55ae1f1537a0, obj=0x55ae1fc2e0a0, this=0x55ae1e45b800) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TBufferIO.cxx:519
#28 TBufferIO::WriteObjectAny (this=0x55ae1e45b800, obj=0x55ae1fc2e0a0, ptrClass=0x55ae1f1537a0, cacheReuse=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TBufferIO.cxx:492
#29 0x00007f752966eddd in operator<< <TObject> (obj=0x55ae1fc2e0a0, buf=...) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/inc/TClass.h:650
#30 TObjArray::Streamer (this=0x7ffc17b34a90, b=...) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/cont/src/TObjArray.cxx:487
#31 0x00007f7528387ce7 in TClass::Streamer (onfile_class=0x0, b=..., obj=0x7ffc17b34a90, this=0x55ae1ed69240) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/inc/TClass.h:609
#32 TBufferFile::WriteObjectClass (this=0x55ae1e45b800, actualObjectStart=0x7ffc17b34a90, actualClass=0x55ae1ed69240, cacheReuse=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TBufferFile.cxx:2553
#33 0x00007f752838e7eb in TBufferIO::WriteObjectAny (cacheReuse=false, ptrClass=0x55ae1ed69240, obj=0x7ffc17b34a90, this=0x55ae1e45b800) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TBufferIO.cxx:522
#34 TBufferIO::WriteObjectAny (this=this
entry=0x55ae1e45b800, obj=obj
entry=0x7ffc17b34a90, ptrClass=0x55ae1ed69240, cacheReuse=cacheReuse
entry=false) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TBufferIO.cxx:492
#35 0x00007f75284329ea in TStreamerInfo::Streamer (this=<optimized out>, R(bool)=...) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TStreamerInfo.cxx:5369
#36 0x00007f7528387ce7 in TClass::Streamer (onfile_class=0x0, b=..., obj=0x55ae1fc2ded0, this=0x55ae1f959ec0) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/inc/TClass.h:609
#37 TBufferFile::WriteObjectClass (this=0x55ae1e45b800, actualObjectStart=0x55ae1fc2ded0, actualClass=0x55ae1f959ec0, cacheReuse=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TBufferFile.cxx:2553
#38 0x00007f752838e8fd in TBufferIO::WriteObjectAny (cacheReuse=<optimized out>, ptrClass=0x55ae1f1537a0, obj=0x55ae1fc2ded0, this=0x55ae1e45b800) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TBufferIO.cxx:519
#39 TBufferIO::WriteObjectAny (this=0x55ae1e45b800, obj=0x55ae1fc2ded0, ptrClass=0x55ae1f1537a0, cacheReuse=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TBufferIO.cxx:492
#40 0x00007f752966a473 in operator<< <TObject> (obj=0x55ae1fc2ded0, buf=...) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/meta/inc/TClass.h:650
#41 TList::Streamer (this=<optimized out>, b=...) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/cont/src/TList.cxx:1255
#42 0x00007f75284117dc in TKey::TKey (this=0x7ffc17b34e20, obj=0x7ffc17b34d40, name=0x7f75285d748a "StreamerInfo", bufsize=373, motherDir=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TKey.cxx:249
#43 0x00007f75283f3968 in TFile::WriteStreamerInfo (this=0x55ae20a4d500) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TFile.cxx:3792
#44 0x00007f75283f2569 in TFile::Close (this=0x55ae20a4d500, option=0x7f75285d79f1 "") at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TFile.cxx:920
#45 0x00007f75283f2b01 in TFile::~TFile (this=0x55ae20a4d500, __in_chrg=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TFile.cxx:527
#46 0x00007f75283f2de9 in TFile::~TFile (this=0x55ae20a4d500, __in_chrg=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/io/io/src/TFile.cxx:525
#47 0x00007f7529667f20 in TList::Delete (this=0x55ae1e358700, option=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/cont/src/TList.cxx:508
#48 0x00007f75295b5824 in TROOT::~TROOT (this=0x7f7529926780 <ROOT::Internal::GetROOT1()::alloc>, __in_chrg=<optimized out>) at /home/uhlig/containers/home/debian10/software/fairsoft/nov22p1/build/Source/root/core/base/src/TROOT.cxx:893
#49 0x00007f7527cb2ebc in __run_exit_handlers (status=0, listp=0x7f7527e33718 <__exit_funcs>, run_list_atexit=run_list_atexit
entry=true, run_dtors=run_dtors
entry=true) at exit.c:108
#50 0x00007f7527cb2fea in __GI_exit (status=<optimized out>) at exit.c:139
#51 0x00007f7527c9d0a2 in __libc_start_main (main=0x55ae1dc8e580 <main(int, char const**)>, argc=9, argv=0x7ffc17b352b8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc17b352a8) at ../csu/libc-start.c:342
#52 0x000055ae1dc8ff6a in _start () at /usr/include/c++/8/bits/basic_string.h:2824

YanzhaoW avatar Nov 16 '23 01:11 YanzhaoW

I would guess, that the crash is due to FairRun* is not being destructed properly (because most examples have a auto run = new FairRun; instead of FairRun run{};). We probably should fix all the remaining examples that have that issue.

That said, we maybe should put some if(fSink) { fSink->Close(); } (and the same for fSource) in FairRun::~FairRun?

But all of that is more for the "unusual cleanup" scenario. FairRunAna not calling Close at all sounds like a real bug for the usual scenario!

ChristianTackeGSI avatar Nov 20 '23 19:11 ChristianTackeGSI

That said, we maybe should put some if(fSink) { fSink->Close(); } (and the same for fSource) in FairRun::~FairRun?

Yes, but just fSink->Close(). I don't understand what's the purpose of this if statement.

We also need to make sure sink::Close() is not called elsewhere.

YanzhaoW avatar Nov 20 '23 21:11 YanzhaoW

Yes, but just fSink->Close(). I don't understand what's the purpose of this if statement.

The if has a good reason: This should happen in the dtor of FairRun. And various use cases allow no sink or no source. So we should only call the Close if the sink/source actually exists.

We also need to make sure sink::Close() is not called elsewhere.

It's not nice if we call Close multiple times on a Sink/Source, but they should survive this! Like std::fstream can also be closed multiple times.

ChristianTackeGSI avatar Nov 21 '23 09:11 ChristianTackeGSI

Just one next thought:

Calling Close from dtor of FairRun means that FairRunSomething doesn't any longer exist (because its dtor has already ran). It only exists as a FairRun. So interacting from CloseImpl with FairRunSomething is not good!

ChristianTackeGSI avatar Nov 21 '23 10:11 ChristianTackeGSI

Just one next thought:

Calling Close from dtor of FairRun means that FairRunSomething doesn't any longer exist (because its dtor has already ran). It only exists as a FairRun. So interacting from CloseImpl with FairRunSomething is not good!

Yes, like I mentioned, whoever owns the the source should call CloseImpl. If we decide to let FairRun own it, FairRun should call it. If FairRunAna owns it, FairRunAna calls it.

YanzhaoW avatar Nov 21 '23 11:11 YanzhaoW

I think, FairRun should own it. See for example #1430.

ChristianTackeGSI avatar Nov 21 '23 14:11 ChristianTackeGSI

Okay, as a first suggest: #1467

ChristianTackeGSI avatar Nov 21 '23 14:11 ChristianTackeGSI