ITK icon indicating copy to clipboard operation
ITK copied to clipboard

ENH: Register/UnRegister only in SmartPointer Con/Destructor.

Open Leengit opened this issue 2 years ago • 36 comments

(This text has been edited)

Only in the constructors and destructors for the SmartPointer should we invoke the underlying object's Register() or UnRegister() methods to handle the smart pointer reference count. The original code had m_ReferenceCount initialized to 1 in the construction of an underlying object (despite that there were 0 smart pointers) and had multiple functions responsible for UnRegister() of that extra count after the created object was assigned to a SmartPointer.

Note that Register() and UnRegister() are okay and have been kept in functions that act as if they were SmartPointer constructors or destructors, such as those that pass raw C++ pointers to be used by Python code. Similarly they are kept in ObjectFactoryBase because, for backward API compatibility, lists of ObjectFactoryBase * are used instead of lists of ObjectFactoryBase::Pointer; and Register() and UnRegister() are used manually each time a pointer is inserted or removed from a list.

PR Checklist

  • [ ] No API changes were made (or the changes have been approved)
  • [ ] No major design changes were made (or the changes have been approved)
  • [X] Added test (or behavior not changed)
  • [X] Updated API documentation (or API not changed)
  • [X] Added license to new files (if any)
  • [X] Added Python wrapping to new files (if any) as described in ITK Software Guide Section 9.5
  • [X] Added ITK examples for all new major features (if any)

Leengit avatar Sep 18 '21 14:09 Leengit

The force push just now updates the commit message.

Leengit avatar Sep 18 '21 14:09 Leengit

@Leengit Interesting 👍 Can you please be more specific about:

This original use supported sloppy code where a pointer was used even after the last SmartPointer could have been destructed by an aggressive compiler

Was there "sloppy code" that depended on the compiler being aggressive or not? Do you mean it could depend on the compiler version?

N-Dekker avatar Sep 19 '21 11:09 N-Dekker

@Leengit Interesting +1 Can you please be more specific about:

This original use supported sloppy code where a pointer was used even after the last SmartPointer could have been destructed by an aggressive compiler

Was there "sloppy code" that depended on the compiler being aggressive or not? Do you mean it could depend on the compiler version?

The code was functional as it was. The software writer who knew that there was an extra reference count lying around (until the extra UnRegister() call) would understand that the code was functional.

On the other hand, someone who assumed that our SmartPointer was like other smart pointer implementations, might have worried about the code. In my case, I looked up the rules for when temporary variables are destructed in C++. After looking that up myself ... I think the code would have been functional regardless -- so "sloppy" is a little too harsh -- but it did have me worried for a while.

Here's an example. How long does the temporary SmartPointer<x> created by x::New() last in this?:

LightObject::Pointer smartPtr = x::New().GetPointer();

If it were implemented by the compiler as

x * tmp = x::New().GetPointer();
LightObject::Pointer smartPtr = tmp;

then the temporary, unamed SmartPointer<x> created by x::New() would go out of scope as the first line was finished and should be UnRegistered then, before the Register() call for the smartPtr constructor in the second line. That would be bad.

However my worry seems to have been unnecessary. The C++ spec says something along the lines that nothing can be desctructed until the entire line of code has been executed, so the original, one-line implementation would/should actually work regardless of the compiler. But that's what had me worried.

Leengit avatar Sep 19 '21 13:09 Leengit

With my next push, I will reword the commit message and PR message to omit the "sloppy code" language. I was worried about it, but that's not the same thing!

Leengit avatar Sep 19 '21 14:09 Leengit

As a user, I would like to be able to occasionally place an ITK object on the stack, instead of using New(). Especially when performance matters, and heap allocation takes too much time, relative to the actual task. I tried to enable this by deriving from ITK, as follows:

#include <itkTranslationTransform.h>

class MyDerivedTransform : public itk::TranslationTransform<> {};

int main()
{
  MyDerivedTransform transformOnTheStack;
}

It compiles well, but it says, at run-time (using ITK 5.2):

WARNING: In itkLightObject.cxx, line 196 LightObject (...): Trying to delete object with non-zero reference count.

Will your PR solve this warning?

N-Dekker avatar Sep 19 '21 21:09 N-Dekker

Will your PR solve this warning?

Yes, I think it might fix that warning at that point in the code.

However, if you want to use that stack variable AND use it with the SmartPointer (because for instance you are calling a routine that has the SmartPointer as an argument) then I think you will have to make sure that the MyDerivedTransform constructors increment m_ReferenceCount and that the destructor decrements m_ReferenceCount. Note, while the constructors might instead call Register(), maybe the desctructor cannot call UnRegister() because it will call delete inappropriately -- unless maybe UnRegister() is overridden appropriately.

Brainstorming further: if MyDerivedTransform is going to have copy constructors or assignment operators too ... then that will have to be thought out. When the stack variable goes out of scope there will be deallocation regardless of outstanding SmartPointers, so all such SmartPointers will need to destruct first.

Leengit avatar Sep 20 '21 00:09 Leengit

Changing the way reference counting is handled during construction of object gives me quite a bit of trepidations.

First, have you run Valgrind with these changes? Otherwise we don't know the reference counting is still working.

Only in the constructors and destructors for the SmartPointer should we invoke the underlying object's Register() or UnRegister() methods to handle the smart pointer reference count. T

I would disagree with this. These have been exposed public API methods since the beginning of ITK and existing code relies on the current interface and behavior. ITK objects can be owned by other objects besides smart pointers.

The code change does not appear to fix any bugs. It seems to just change the existing behavior which ITK developers have relied on. 👎

Perhaps at the next major ITK release the objects interface for smart pointers could be refined but I am opposed to it during minor release changes.

Adding: This seems like a very high risk, low reward change.

blowekamp avatar Sep 20 '21 13:09 blowekamp

@blowekamp Thank you for the excellent feedback.

I have not tried Valgrind. I will figure out how that is done with ITK ... though if you have a quick pointer on how to do that, that would be much appreciated.

There could be an exception someplace that I missed, but I believe the extra reference count is not exposed through the public API. For example, it is just before the SmartPointer is exposed when the extra reference count (whether from Create or new) is UnRegistered in the original code for x::New(): https://github.com/InsightSoftwareConsortium/ITK/blob/eec9fe67056bfb5c693a4915b86eac9271e1dc83/Modules/Core/Common/include/itkMacro.h#L316-L327

(Or am I being too dense to understand that you are saying that the public API of ITK includes new x and/or ObjectFactory<x>::Create()?)

ITK objects can be owned by other objects besides smart pointers.

Please tell me more!

The code change doesn't change the public API (that I know) and doesn't fix any existing bug (that I know) but is primarily to head off future bugs. With the original implementation, someone who uses new x or ObjectFactory<x>::Create() inside of ITK is often required to also invoke UnRegister(), and whether that is needed or not depends upon a good understanding of our non-standard implementation.

Perhaps at the next major ITK release the objects interface for smart pointers could be refined but I am opposed to it during minor release changes.

For me, it hinges on whether there are any public API changes, such as the uses via other than SmartPointer that you mention.

Leengit avatar Sep 20 '21 14:09 Leengit

I am following the valgrind procedure from https://itk.org/Wiki/ITK/Dynamic_Analysis_Practices and will report back.

Leengit avatar Sep 20 '21 15:09 Leengit

An interesting (useful) feature of ITK's internal reference counting is that it allows "round-tripping" from smart to raw pointer and back, like this:

  {
    itk::SmartPointer<itk::Object> smartPtr1 = itk::Object::New();
    itk::Object* rawPtr = smartPtr1.GetPointer();
    itk::SmartPointer<itk::Object> smartPtr2{ rawPtr };  // OK
  }

@Leengit Would such a round-trip still be supported with your pull request?

Note that a similar round-trip between std::shared_ptr and raw pointer has undefined behavior:

  {
    std::shared_ptr<int> sharedPtr1 = std::make_shared<int>(1);
    int* rawPtr = sharedPtr1.get();
    std::shared_ptr<int> sharedPtr2{ rawPtr }; 
  }   // Undefined behavior! Likely to crash!

N-Dekker avatar Sep 21 '21 13:09 N-Dekker

@Leengit Would such a round-trip still be supported with your pull request?

Yes, that still works.

Leengit avatar Sep 21 '21 14:09 Leengit

As the commit message for the latest commit indicates: Keeping it simple: with this commit we are reverting pretty much all changes in the previous commit that are not directly related to the reference counts.

Leengit avatar Sep 21 '21 14:09 Leengit

I just fixed a spelling error in the commit message. I am re-running valGrind on this simpler commit to make sure that it doesn't break anything ... and I will report back.

Leengit avatar Sep 21 '21 14:09 Leengit

FWIW, the new implementation, of setting m_ReferenceCount to 0 instead of to 1 in the itk::LightObject constructor, mirrors the implementation we already have of setting ReferenceCount to 0 in the gdcm::Object constructor.

Leengit avatar Sep 21 '21 15:09 Leengit

@Leengit I like your pull request, but I think I agree with Bradley (@blowekamp) that it should not be done in a minor release. So then, what could be the next major release for this PR to be included? Would that still be ITK 5.x, or even ITK 6.0?

N-Dekker avatar Sep 23 '21 08:09 N-Dekker

I think that ITK 6.0 is not that far off. Maybe 2-3 years away? @thewtex could clarify the plan, but it also depends on the sentiment of the rest of us.

dzenanz avatar Sep 23 '21 13:09 dzenanz

Just force-pushed to rebase to current master.

Leengit avatar Sep 24 '21 16:09 Leengit

From valgrind I am getting 4 defects; see below. However, looking within memcheck_index.html and the web pages it links to doesn't tell me anything that is meaningful to me. Help in understanding what this is implies would be much appreciated!

Total Test time (real) = 18913.94 sec
-- Processing memory checking output:
932/2839 MemCheck: #923: itkImageFileWriterPastingTest2_6 .......................................................................   Defects: 1
933/2839 MemCheck: #924: itkImageFileWriterPastingTest2_7 .......................................................................   Defects: 1
940/2839 MemCheck: #931: itkImageFileWriterStreamingTest1_2 .....................................................................   Defects: 1
941/2839 MemCheck: #932: itkImageFileWriterStreamingTest1_3 .....................................................................   Defects: 1
MemCheck log files can be found here: ( * corresponds to test number)
~/git/InsightSoftwareConsortium/ITK-master-build2/Testing/Temporary/MemoryChecker.*.log
Memory checking results:
Invalid syscall param - 4
<h3>Dynamic analysis started on Sep 24 14:57 EDT</h3>
Site Name: | kitlee
-- | --
Build Name: | 20210924-1839-Experimental
Test Result: | passed
Name: | itkImageFileWriterPastingTest2_6
Path: | ./Modules/IO/ImageBase/test
Command: | /usr/bin/valgrind  "--log-file=~/git/InsightSoftwareConsortium/ITK-master-build/Testing/Temporary/MemoryChecker.923.log"  "-q" "--tool=memcheck" "--leak-check=yes" "--show-reachable=yes"  "--num-callers=50"  "~/git/InsightSoftwareConsortium/ITK-master-build/Wrapping/Generators/Python/itk/ITKIOImageBaseTestDriver"  "--compare"  "~/git/InsightSoftwareConsortium/ITK-master-build/ExternalData/Testing/Data/Baseline/IO/HeadMRVolume.mhd"   "~/git/InsightSoftwareConsortium/ITK-master-build/Testing/Temporary/itkImageFileWriterPastingTest2_6.mha"  "itkImageFileWriterPastingTest2"  "~/git/InsightSoftwareConsortium/ITK-master-build/ExternalData/Testing/Data/Input/HeadMRVolume.mha"   "~/git/InsightSoftwareConsortium/ITK-master-build/Testing/Temporary/itkImageFileWriterPastingTest2_6.mha"   "~/git/InsightSoftwareConsortium/ITK-master-build/ExternalData/Testing/Data/Input/HeadMRVolume.mha"

<pre>eJzdlMGO2jAQhu88xZxKWNEtCSRpLUCCLJFQD111kXpEE2dKrTU2sqctefs6yR5oBXvqpc0pGfv/PP5kZ14tH1ef5++qJSwW8SyZxNliAU+Nl6g1nNDhEZSVrKMDGXJKjuBklWEPbOG7wbp25D1WmqBqmCI/GlxwwoMMk/OsKIt0shY9CiLf898yHU8ame6fRP7+z2TVhOS63DwURQjycwgJEToLoZ21OnwU2hoqlabCGibDkedaiP1ens9xLESFXsm9Z6fMYS6/oRtDN6F93bNDxb4rL1/qoSMrka3rq7AEaY3nNy/Df507guhiO/eBLpLph/SGh3KSpbc82FPTaljpn9j4/8DCLE1uWYg301csrFoN/5iAMVRhA1c15PFVDdMyzfP+MGyPeOiuwBenmNwjeg7t7MhzEoVrOoZ2rbu7QH99crdgfv3sJZskyzIBR1QGou3u4/ZTR1qjpzb84NQPch1hGue/I1b9DyJQUlAejOUgA+XzsB4HXmtkWIN1EDmS4QbrZgRfHdGwvqQMBoNf/FSJZw==</pre>

Leengit avatar Sep 24 '21 19:09 Leengit

All 4 valgrind defects mentioned above are present on the master branch too ... apparently unrelated to this pull request.

I will force-push again, to rebase this branch to the current master.

Leengit avatar Sep 27 '21 14:09 Leengit

Asking github.com to search all projects of the InsightSoftwareConsortium repository finds calls to UnRegister outside of the ITK project only from:

The occurrences in ITKTubeTK look like they are independent of this pull request and the others are documentation and performance testing. If we are moving forward with this pull request I'd want to test ITKTubeTK against it, including with valgrind, to be sure. Can you recommend other repositories / projects beyond InsightSofware/Consortium/* that I should look in for use of UnRegister()?

Leengit avatar Sep 27 '21 15:09 Leengit

Can you recommend other repositories / projects beyond InsightSofware/Consortium/* that I should look in for use of UnRegister()?

Some of ITK's remotes are under https://github.com/KitwareMedical, so it may be worthwhile considering them.

jhlegarreta avatar Sep 27 '21 16:09 jhlegarreta

Some of ITK's remotes are under https://github.com/KitwareMedical, so it may be worthwhile considering them.

Thanks! Yes, work would need to be done there. Specifically, of the occurrences of UnRegister() that appear there,

the ITKUltrasound files do rely on the reference count scheme that I am proposing be replaced. The UnRegister() commands there would have to be removed.

Leengit avatar Sep 27 '21 17:09 Leengit

A proposal: we could could change the code in KitwareMedical/ITKUltrasound from

  static Pointer
  New(void)
  {
    Pointer smartPtr = ObjectFactory<Self>::Create();
    if (smartPtr.GetPointer() == nullptr)
    {
      smartPtr = new Self;
    }
    smartPtr->UnRegister();
    return smartPtr;
  }

to

  static Pointer
  New(void)
  {
    Pointer smartPtr = ObjectFactory<Self>::Create();
    if (smartPtr.GetPointer() == nullptr)
    {
      smartPtr = new Self;
    }
    if (smartPtr->GetReferenceCount() > 1)
    {
        smartPtr->UnRegister();
    }
    return smartPtr;
  }

This will work whether compiled against the current ITK or the proposed ITK. We could also insert something about legacy remove....

Leengit avatar Sep 27 '21 18:09 Leengit

With legacy-remove functionality

  static Pointer
  New(void)
  {
    Pointer smartPtr = ObjectFactory<Self>::Create();
    if (smartPtr.GetPointer() == nullptr)
    {
      smartPtr = new Self;
    }
#if !defined(ITK_FUTURE_LEGACY_REMOVE)
    if (smartPtr->GetReferenceCount() > 1)
    {
        smartPtr->UnRegister();
    }
#endif
    return smartPtr;
  }

Leengit avatar Sep 27 '21 18:09 Leengit

We have played with Register/Unregister to fix a memory leak in the Cuda context manager in RTK (a singleton), see SimonRit/RTK@35aa60360d5a1d4777a111dc861360cc3c45176c. The code is originally based on the GPU context manager (in ITK) but has evolved independently. I don't really understand this commit (now) so maybe it should be revisited. @jjomier

SimonRit avatar Sep 28 '21 08:09 SimonRit

I find several places in ITK itself where we fail to use UnRegister() in the current implementation of a New() method, so these are objects that will not be subject to delete when the last SmartPointer is destructed. If we defer the present pull request, we should fix these and any others like these.

https://github.com/InsightSoftwareConsortium/ITK/blob/f37e2cb211240e82d41f1f6a1b5e4b25797e793e/Modules/Filtering/FFT/include/itkForwardFFTImageFilter.hxx#L65-L79

https://github.com/InsightSoftwareConsortium/ITK/blob/f37e2cb211240e82d41f1f6a1b5e4b25797e793e/Modules/Filtering/FFT/include/itkHalfHermitianToRealInverseFFTImageFilter.hxx#L65-L77

https://github.com/InsightSoftwareConsortium/ITK/blob/f37e2cb211240e82d41f1f6a1b5e4b25797e793e/Modules/Filtering/FFT/include/itkInverseFFTImageFilter.hxx#L66-L78

https://github.com/InsightSoftwareConsortium/ITK/blob/f37e2cb211240e82d41f1f6a1b5e4b25797e793e/Modules/Filtering/FFT/include/itkRealToHalfHermitianForwardFFTImageFilter.hxx#L64-L78

https://github.com/InsightSoftwareConsortium/ITK/blob/f37e2cb211240e82d41f1f6a1b5e4b25797e793e/Modules/Filtering/FFT/include/itkComplexToComplexFFTImageFilter.hxx#L78-L91

Also, I would look deeper into the following example, which uses AutoPointer rather than SmartPointer

https://github.com/InsightSoftwareConsortium/ITK/blob/f37e2cb211240e82d41f1f6a1b5e4b25797e793e/Modules/Core/QuadEdgeMesh/include/itkQuadEdgeMeshPolygonCell.hxx#L96-L104

Leengit avatar Sep 29 '21 15:09 Leengit

Also, some objects in Modules/ThirdParty/GDCM/src/gdcm/Source/ have a New() method without a call to UnRegister() and, if the present pull request is deferred, might need to be fixed -- it depends upon whether that ThirdParty code is using our SmartPointer and Object, or not.

Leengit avatar Sep 29 '21 15:09 Leengit

@malaterre should be able to answer the question about Modules/ThirdParty/GDCM/src/gdcm/Source/.

dzenanz avatar Sep 29 '21 19:09 dzenanz

@Leengit the smartpointer implementation in GDCM reproduce the behavior of ITK smartpointer. So I suspect the issue is also in the GDCM codebase. I believe this can be fixed at a later point, this is not a blocker for this PR. Thanks.

malaterre avatar Sep 30 '21 06:09 malaterre

Given the various never deallocated objects identified in the code in these recent comments, this issue might more properly be labeled BUG than an ENH. Which way should we go?:

  1. Should we add calls to UnRegister() in those places that don't yet remove the m_ReferenceCount that was set by the itkLightObject constructor, or
  2. should we not set that reference count from that constructor, and remove calls to UnRegister() from those places that have been removing that reference count?

This PR should achieve Option 2. If we go with Option 2, is it for ITK 5.3.1, 5.4, or 6.0? If we defer Option 2 beyond 5.3.1 then should we implement Option 1 for the short term?

My 2¢ towards answering my own questions: Option 2 is a change to the ITK API only in the sense that module writers who knew of the previous behavior will find it changing. In contrast, module writers who did not know of the current behavior will be happy that their memory leaks are being fixed. Other than the module writers, those who use ITK and its modules will not see an API change. I'd favor having Option 2 in 5.3.1.

Leengit avatar Sep 30 '21 12:09 Leengit