ITKUltrasound icon indicating copy to clipboard operation
ITKUltrasound copied to clipboard

WIP: PERF: Use NeighborhoodRange for metric image computation

Open thewtex opened this issue 6 years ago • 10 comments

Depends on:

http://review.source.kitware.com/#/c/23795/4

Currently 4X slower

Closes #97

thewtex avatar Oct 17 '18 21:10 thewtex

CC: @N-Dekker @phcerdan

Experimenting with the proposed neighborhood range constant pixel access policy. :microscope: Currently, this seems to be quite a bit slower, though. :snail:

thewtex avatar Oct 17 '18 22:10 thewtex

Hey @thewtex, what script/test are you using for testing the performance?

phcerdan avatar Oct 18 '18 00:10 phcerdan

@phcerdan itkBlockMatchingBayesianRegularizationDisplacementCalculatorTest.

thewtex avatar Oct 18 '18 02:10 thewtex

This use case might suggest adding ShapedImageNeighborhoodRange::SetLocation(const IndexType&), so that the ranges could be declared and initialized outside the loop, and then have their location updated inside the loop, by calling range.SetLocation(denomIt.GetIndex()) for each range.

Update: The member function is added with patch set 5: http://review.source.kitware.com/#/c/23795/5

N-Dekker avatar Oct 18 '18 10:10 N-Dekker

Great ideas @N-Dekker ! :bulb:

And thanks for adding the SetLocation patch to move reconstruction out of the loop. I will try that and manually expanding the inner product. This will probably happen next week.

On a related note, I was examining the patch and the existing code, and the construction of the proxy in operater*():

https://github.com/InsightSoftwareConsortium/ITK/blob/a5d40b033f809b8d54e0bda9bd1eb5e7613faf69/Modules/Core/Common/include/itkShapedImageNeighborhoodRange.h#L348-L351

Do you think the construction here has performance implications and would it be avoidable?

thewtex avatar Oct 18 '18 17:10 thewtex

I was examining the patch and the existing code, and the construction of the proxy in operater*(): Do you think the construction here has performance implications and would it be avoidable?

The use of a proxy as return type of operater*() is hard to avoid, for a constant-boundary iterator. Because operater*() cannot return a 'real' reference to an existing pixel from the image, when trying to retrieve a pixel value outside the image boundaries.

However, looking at http://review.source.kitware.com/#/c/23795/5/Modules/Core/Common/include/itkConstantBoundaryImageNeighborhoodPixelAccessPolicy.h my intuition tells me that it might be possible to squeeze some CPU cycles out of the private helper function CalculatePixelIndexValue, which is called with every operater*(). CalculatePixelIndexValue has an if statement and a return inside its for loop which might have performance implications... but I'm not sure! The compiler optimizer might be smarter than my intuition!

Update: With patch set 7 I adjusted the private helper function CalculatePixelIndexValue, removing the return from inside its for loop, and indeed, it appears to significantly improve the performance! Using VS2017 64-bit Release, I observed a reduction of more than 30% of the run-time duration!

N-Dekker avatar Oct 18 '18 21:10 N-Dekker

Experimenting with the proposed neighborhood range constant pixel access policy. 🔬 Currently, this seems to be quite a bit slower, though. 🐌

@thewtex Which compiler do you use? (Release build, optimized for speed?) I'm asking because I observed significant PERF differences between VS2015 and VS2017 (both 64-bit Release), while running the example code that I posted at https://discourse.itk.org/t/custom-border-extrapolation-of-shapedimageneighborhoodrange-by-imageneighborhoodpixelaccesspolicy/879/27

N-Dekker avatar Oct 21 '18 10:10 N-Dekker

@N-Dekker this was Clang / MinSizeRel build. I will try other compilers and other build configurations, along with your example code!

thewtex avatar Oct 21 '18 22:10 thewtex

@thewtex I'm interested to see your results! Using VS2017 Release, I found that NeighborhoodRange based iteration was almost 3x faster than the old-school itk::ConstNeighborhoodIterator. Unfortunately, I have bad news... or is it good news?!? With the ConstNeighborhoodIterator patch http://review.source.kitware.com/#/c/23813/ that I just posted, the old-school iteration actually beats the NeighborhoodRange!!! At least when using VS2017 Release, on the example that I posted at https://discourse.itk.org/t/custom-border-extrapolation-of-shapedimageneighborhoodrange-by-imageneighborhoodpixelaccesspolicy/879/27 However, there are still some good reasons to consider NeighborhoodRange:

  • The range class supports very fast change of location to an arbitrary position, especially when we add ShapedImageNeighborhoodRange::SetLocation (while the old itk::ConstNeighborhoodIterator::SetLocation is clearly "not optimized for speed").
  • It is thread-safe (wheras the old itk::ConstNeighborhoodIterator has unsafe mutable data).
  • It has a modern C++ interface.

N-Dekker avatar Oct 22 '18 11:10 N-Dekker

Performance results discussed here:

https://discourse.itk.org/t/custom-border-extrapolation-of-shapedimageneighborhoodrange-by-imageneighborhoodpixelaccesspolicy/879/29

thewtex avatar Nov 07 '18 18:11 thewtex