elastix
elastix copied to clipboard
elastix 4.9 static libary version not thread safe
I was using elastix 4.9 static library version. I created two threads to run registrations simultaneously, one with 3d registration and the other with 2d registration. And the result was either elastix crashes the program or it produced wrong registered results. Howerver, if i run the two threads sequentially, the output result was ok. I guess the issue might be resulted from some static varibles used internally by elastix. Can someone investigate this issue?
This is indeed a known issue. I was under the impression it was (partially) fixed in the develop branch? @N-Dekker do you know?
@mstaring @N-Dekker Is the issue resolved for elastix 4.9? Because I am using elastix 5.0 as static library and it seems not to be thread-safe. I have tried both the master branch (Release/Debug) and the develop branch (Release) but the issue remains. I am on Windows 10 with VS2017 if it is relevant. Thanks in advance.
It seems that there are two (at least) groups of things that are making elastix not thread safe. After certain modifications in the current develop branch, I was able to run, finally, registrations in parallel at user level. Probably, the two groups are known but I add them here for the discussion:
The static ComponentDatabasePointer
& ComponentLoaderPointer
(link). I assume they are static so that the components are not loaded/installed if the user provides more than one parameterMaps at this loop. Is there maybe another reason that they are static? I turned them to normal members of the ElastixMain
class to make it work. Of course, a mutex at init could also work if they need to be static.
The second group is more-or-less anything related with the logger and I guess a proper solution would be quite involved. What I did was to simply deactivate any function that was performing write operation. Of course, this means that the logger was not working (not a problem for the application I am working on though). Could that be a possible temporary option? When the user has set LogToConsoleOff()
and LogToFileOff()
then the logger is deactivated and then running in parallel can work.
I am looking forward to hearing your thoughts. By the way, please let me know if there is any other datarace/thread-safety issue that I haven't spotted. Thanks!
Interesting suggestions @ntatsisk So a possibility could be to have multiple ElastixMain
objects running in parallel, each having their own (non-static) ComponentDatabase. Do I understand correctly that each ElastixMain
object would then internally have a ComponentDatabase with exactly the same content?
Thanks for the reply, @N-Dekker! Yes, in the naive option that I followed each ElastixMain
object has its own ComponentDatabase
with exactly the same content. However, even in the application that I am working on which involves 600 rigid registrations using 24 threads required to run in under 10 sec, I didn't notice any extreme memory consumption and the speed goal was also achieved.
Now, maybe a better solution would be to keep both ComponentDatabase
and ComponentLoader
static but add a mutex during their initialization (loading of the components) because that is the point that I noticed there are data races. After that loading I think both objects are only read so no issue there. Maybe a good starting point for the mutex would be this one but I haven't tried that.
@ntatsisk If the naive option of having a ComponentDatabase
for each ElastixMain object does not really affect the performance, it may be worth considering. But then I'm still wondering if it would break any code, to no longer being able to call ElastixMain ::GetComponentDatabase()
as a static function. What d you think?
If both ComponentDatabase and ComponentLoader should still remain static anyway, maybe we could make use of the C++11 feature "magic statics". From C++11, the initialization of a local static variable is guaranteed thread-safe (no need to use a mutex) 😃 http://www.nuonsoft.com/blog/2017/08/10/implementing-a-thread-safe-singleton-with-c11-using-magic-statics
@N-Dekker, from a quick search in the elastix codebase I see that ElastixMain::GetComponentDatabase()
(as well as its Set counterpart) are called limited times and always through an ElastixMain
object. So, I don't see why they were static in the first place since a non-static function can return a static member variable. Of course, I have limited experience with the elastix source code and maybe I am missing something ;)
However I think we should still consider keeping the ComponentDatabase and ComponentLoader static combined with a singleton (not just a mutex that I mentioned previously). Btw, I didn't know about the magic statics, great feature! How could this be implemented? A vague idea is using only one singleton for the ComponentLoader which should initialize the ComponentDatabase and do the ComponentLoader::LoadComponents
functionality in its constructor. A bonus is that it would also remove the need for LoadComponents
and UnloadComponents
functions which is against the RAII principle. I could give that a try. Any different ideas?
Thanks @ntatsisk I was just considering to call ElastixMain::GetComponentDatabase() myself, in a context where I may not have an ElastixMain object. But I have to have another look.
A bonus is that it would also remove the need for LoadComponents and UnloadComponents functions which is against the RAII principle.
While I'm a big fan of the RAII principle, I don't care so much about ComponentLoader::UnloadComponents()
being called properly, as it has been empty for more than a decade now 😸 From commit c09357417b0feb33d86812035c255e26473dfd38 by Stefan (@stefanklein), 12 Feb 2008: https://github.com/SuperElastix/elastix/blob/c09357417b0feb33d86812035c255e26473dfd38/src/Core/Install/elxComponentLoader.cxx#L153-L162
Do you agree that it's OK not to worry about ComponentLoader::UnloadComponents()
?
Wow, this feels like a time machine :)
From: Niels Dekker [email protected] Sent: Tuesday, December 22, 2020 12:58 PM To: SuperElastix/elastix [email protected] Cc: S. Klein [email protected]; Mention [email protected] Subject: Re: [SuperElastix/elastix] elastix 4.9 static libary version not thread safe (#174)
Thanks @ntatsiskhttps://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fntatsisk&data=04%7C01%7Cs.klein%40erasmusmc.nl%7C5ebdb9c33db04e0b485008d8a670d30d%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637442350815728248%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=IEmi%2BAba1%2Ftq8PBFtJxZ8VUOdb0TU8dPA2PrYsfMmFU%3D&reserved=0 I was just considering to call ElastixMain::GetComponentDatabase() myself, in a context where I may not have an ElastixMain object. But I have to have another look.
A bonus is that it would also remove the need for LoadComponents and UnloadComponents functions which is against the RAII principle.
While I'm a big fan of the RAII principle, I don't care so much about ComponentLoader::UnloadComponents() being called properly, as it has been empty for more than a decade now 😸 From commit c093574https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FSuperElastix%2Felastix%2Fcommit%2Fc09357417b0feb33d86812035c255e26473dfd38&data=04%7C01%7Cs.klein%40erasmusmc.nl%7C5ebdb9c33db04e0b485008d8a670d30d%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637442350815728248%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=TdQ6g%2FYIMRuNLNW5EX8ITz%2F7Q2SCD2rNgLWDjWHPhq8%3D&reserved=0 by Stefan (@stefankleinhttps://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fstefanklein&data=04%7C01%7Cs.klein%40erasmusmc.nl%7C5ebdb9c33db04e0b485008d8a670d30d%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637442350815738242%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=OIv5Kgl0vuYnvn8bGCbDYZb2sRL3rs%2FVLVi020QFXdk%3D&reserved=0), 12 Feb 2008: https://github.com/SuperElastix/elastix/blob/c09357417b0feb33d86812035c255e26473dfd38/src/Core/Install/elxComponentLoader.cxx#L153-L162https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FSuperElastix%2Felastix%2Fblob%2Fc09357417b0feb33d86812035c255e26473dfd38%2Fsrc%2FCore%2FInstall%2FelxComponentLoader.cxx%23L153-L162&data=04%7C01%7Cs.klein%40erasmusmc.nl%7C5ebdb9c33db04e0b485008d8a670d30d%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637442350815738242%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=uJqqfsrkBM9ZW2MCThkhEM393n7zafChxMM2ZOZ6ufM%3D&reserved=0
Do you agree that it's OK not to worry about ComponentLoader::UnloadComponents()?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FSuperElastix%2Felastix%2Fissues%2F174%23issuecomment-749505067&data=04%7C01%7Cs.klein%40erasmusmc.nl%7C5ebdb9c33db04e0b485008d8a670d30d%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637442350815748237%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=sP7L89uApoaEwVNuifIW6X7AqzR%2FrzpmENDFZeLfpm0%3D&reserved=0, or unsubscribehttps://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAAF2LNIHPD2Z4PCY2HXMNJTSWCCUDANCNFSM4IMFJPEA&data=04%7C01%7Cs.klein%40erasmusmc.nl%7C5ebdb9c33db04e0b485008d8a670d30d%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637442350815748237%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=Ofv%2BBHKQMBrsLHSOcoq6qbjCPlLfX7Hz2rX9HXp4oxo%3D&reserved=0.
@stefanklein Great flashback from the elastix past indeed! ;)
@N-Dekker Sorry, I was not explicit. I was mainly worried about the ElastixMain::UnloadComponents
:
https://github.com/SuperElastix/elastix/blob/3acd58e5efbe820305f4ba696bdaab4658a39861/Core/Kernel/elxElastixMain.cxx#L760-L776
But I think that this is a slightly different discussion since it could be fixed in either case (static or not). I think that discussion goes in favor of keeping ComponentLoader and ComponentDatabase static and working around that, right? That way you will also have the option to call ElastixMain::GetComponentDatabase()
on its own without an object.
@ntatsisk Thanks for your explanation regarding UnloadComponents()
. In general I'm reluctant to introduce static
data or singletons. I'm aware of the Singleton being a controversial design pattern, or "anti-pattern": https://en.wikipedia.org/wiki/Singleton_pattern#Anti-Pattern_considerations But now that s_CDB
and s_ComponentLoader
are static already, maybe we should first try if we can just fix the thread-safety issue, while keeping them static... albeit "magic static" 😸
@ntatsisk I just did a first attempt (draft) to fix the issue, at https://github.com/SuperElastix/elastix/tree/ElastixMain-LoadComponents-thread-safe Is it similar to what you have in mind? Or did you already prepare a fix as well?
My first attempt did use C++11 "magic statics" to ensure that the components are loaded exactly once (in a thread safe manner). But then they are only unloaded at the end of the program (automatically). I hope that's OK!
Also there are still some test failures at the CI! https://github.com/SuperElastix/elastix/runs/1678230364 and https://dev.azure.com/kaspermarstal/Elastix/_build/results?buildId=1526&view=logs&j=34a65711-f108-577e-6707-abb292f048dd&t=8c37f5e7-baf8-54ce-925d-207c0d622869
@stefanklein @mstaring Is it important to be able to explicitly unload the components any time elastix is running? If so, for what particular use case?
@ntatsisk The component database should now be thread-safe, with "develop" branch commit e5b36582b6d9f6a281dd3a52c2ac70d37b973772. I made a first attempt to ensure that the management of output streams (opening and closing of streams) is also thread-safe, at a new branch, Share-xoutManager. If you have the time, please have a look. The idea is that threads may share the "output manager", so that only the first thread opens the streams, while only the last thread closes them.
Hi @N-Dekker, thank you for your work so far on this issue and now continuing with the logger! I have started looking at the implementation details of the current logger as well as the changes that you proposed in the new branch. However, I have a question (maybe basic): The resetGuard
seems to belong the local scope of the GetSharedManager
function. Doesn't that mean that the returned shared pointer will always be reset when the guard goes out of scope (aka the function always returns nullptr)? What is the idea behind the ResetGuard
?
I have started looking at the implementation details of the current logger as well as the changes that you proposed in the new branch. However, I have a question (maybe basic): The
resetGuard
seems to belong the local scope of theGetSharedManager
function. Doesn't that mean that the returned shared pointer will always be reset when the guard goes out of scope (aka the function always returns nullptr)? What is the idea behind theResetGuard
?
Thanks for having a look at my possible "shared xout manager" solution!
First of all I should say that this may not be the final solution. Marius (@mstaring) suggested that each elastix-filter could have its own xout-object, instead of having a global xout-object that is shared between the filter instances.
Anyway, the intention of xoutManager::GetSharedManager
is that all filter instances would share the same "manager object". But then, the manager object should be destroyed automatically, when all filters are finished. If GetSharedManager
would just create a local static shared_ptr<xoutManager>
object by new xoutManager(...)
and return the shared pointer, it would never be destructed anymore! The reference count of the shared pointer would never get back to zero!
This is what I try to avoid by doing a reset()
when the function goes out of scope. So that only the callers (the filters) keep the manager alive.
Anyway, even if this is a proper solution for the lifetime (creation and destruction) of xout
, it does not yet solve concurrent access to the output streams. To be continued...
@ntatsisk Would a solution that requires switching off output to TransformParameters.txt be acceptable (or useful) to you? By configuration option (WriteFinalTransformParameters "false")
also, please check that the output directory has been set to different directories for each instance of elastix.
@N-Dekker Indeed, that would be useful. We only use the transformation parameter values internally in the code and we don't output anything to file/console (LogToConsoleOff()
& LogToFileOff()
).
@ntatsisk FYI I just finished my task to ensure that all parameters that are written to TransformParameter.txt files are also available in memory, in transform parameter maps. I guess that might be helpful to you as well 😃 (Now merged onto the develop branch: pull request #402, #395, #385)