elastix icon indicating copy to clipboard operation
elastix copied to clipboard

BUG: Fix OpenCL pyramids (original: https://github.com/SuperElastix/elastix/pull/243)

Open ntatsisk opened this issue 2 years ago • 2 comments

This PR provides the cosmetic changes needed for PR https://github.com/SuperElastix/elastix/pull/243 together with a small addition. Hence, full credits should be given to @squll1peter for writing up the original PR and resolving the CL_OUT_OF_RESOURCES issue discussed in PR https://github.com/SuperElastix/elastix/pull/243 and issue https://github.com/SuperElastix/elastix/issues/70! Two different bugs are actually solved. In short summary:

  1. The CL_OUT_OF_RESOURCES error. This bug was created because itk::SmoothingRecursiveGaussianImageFilter was overriding (here: https://github.com/InsightSoftwareConsortium/ITK/blob/3c6de8f009aadfab8f801070237cb699f5c895cb/Modules/Filtering/Smoothing/include/itkSmoothingRecursiveGaussianImageFilter.h#L80-L87) the internal pixel type of RecursiveGaussianImageFilter member objects using the templated struct Rebind. However, Rebind was only implemented for itk::Image and not itk::GPUImage, making the latter call the implementation of the former and leading to mixing of the two types. The solution here was to create the equivalent Rebind for itk::GPUImage

  2. A crash was appearing during the start of the second registration resolution because only the first output of the pyramid member object was communicated/grafted to the FixedGenericPyramid / MovingGenericPyramid objects. A simple for-loop to graft all outputs was added as solution.


In addition to the above bug-fixing changes, the class signature of itkGPUResampleImageFilter was updated to match the one of itkResampleImageFilter where an extra templated parameter was added (TTransformPrecisionType). The itkGPUResampleImageFilterFactory was updated accordingly.

An important note: Several of the related tests were passing successfully before this changes, where clearly they shouldn't. So, some work needs to be done there to make them effective.

A second note is that memory allocation enhancement, provided in the original PR, is not implemented here, but it should be looked into the future.

ntatsisk avatar Sep 12 '22 12:09 ntatsisk

Great work!

A second note is that memory allocation enhancement, provided in the original PR, is not implemented here, but it should be looked into the future.

I think a new test will make this bug visible, when run on a modern GPU.

mstaring avatar Sep 12 '22 14:09 mstaring

This PR & comment summarized the reasoning and code elegantly, I really appreciate it!

squll1peter avatar Sep 12 '22 15:09 squll1peter

I added some changes to the pyramid tests because I noticed they were all passing before this PR, which clearly showed that the bug/crash in the code was not really covered in the tests.

The relevant test here is the GPUGenericMultiResolutionPyramid test. The initial reason that it was passing was that in fact the CPU version of the Resampler was being called instead of the GPU, which indicates that the corresponding Factory+overloaded function (this one: https://github.com/SuperElastix/elastix/blob/fdd71315d26ef61b814f77ec939c0e10951e537b/Common/OpenCL/Factories/itkGPUResampleImageFilterFactory.h#L84-L88) was not doing its job. Looking into it further, it appears that PrecisionType was not specified for the GenericMultiResolutionPyramid inside the test, which in turn meant that default double was used. Still, the corresponding Factory should have worked, but it seems that it doesn't, which is a bug. Here, I just added the PrecisionType=float in order to call a different overloaded function of the Factory, which in fact works (the same used inside the code). So, in this way we avoid using the problematic factory but still the bug is there (maybe it is in the ITK side, since all the factor magic happens inside the itkObjectFactoryBase.h file).

As expected, this change allowed the GPUResampler to get called as it should. However, the test was still passing successfully! Reminder: the equivalent part of the code crashes. Why? I don't understand fully all the details, but the high-level difference is that inside the code we call GenericMultiResolutionPyramidImageFilter<GPUInputImageType, GPUOutputImageType, PrecisionType> while inside the test: GenericMultiResolutionPyramidImageFilter<InputImageType, OutputImageType, PrecisionType>. So, I added an extra test to cover also for the case of calling the pyramid filter with itk::GPUImages instead of itk::Images. This new test fails as expected before the changes introduced here, and passes after the changes in this PR.

ntatsisk avatar Sep 28 '22 14:09 ntatsisk

To explain a bit better the problem with the factory registration, I created a minimal example:

#include "itkProcessObject.h"
#include "itkObjectFactoryBase.h"
#include "itkImage.h"

namespace itk
{

// MyFilter
template <typename TInputImage, typename TParameter=double>
class MyFilter : public ProcessObject 
{
public:
  using Self = MyFilter;
  using Superclass = ProcessObject;
  using Pointer = SmartPointer<Self>;
  using ConstPointer = SmartPointer<const Self>;
		  
  itkNewMacro(Self);
  itkTypeMacro(MyFilter, ProcessObject);
  
}; // end MyFilter

// SubFilter
template <typename TInputImage, typename TParameter=double>
class SubFilter : public MyFilter<TInputImage, TParameter> 
{
public:
  using Self = SubFilter;
  using Superclass = MyFilter;
  using Pointer = SmartPointer<Self>;
  using ConstPointer = SmartPointer<const Self>;
		  
  itkNewMacro(Self);
  itkTypeMacro(SubFilter, MyFilter);
  
}; // end SubFilter

// SubFilterFactory
class SubFilterFactory : public itk::ObjectFactoryBase
{
public:
  using Self = SubFilterFactory;
  using Superclass = itk::ObjectFactoryBase;
  using Pointer = itk::SmartPointer<Self>;
  using ConstPointer = itk::SmartPointer<const Self>;

  itkFactorylessNewMacro(Self);
  itkTypeMacro(SubFilterFactory, itk::ObjectFactoryBase);

  static void
  RegisterOneFactory()
  {
    auto factory = SubFilterFactory::New();

    itk::ObjectFactoryBase::RegisterFactory(factory);
  }
  
  const char *
  GetITKSourceVersion() const override
  {
    return "itk version 5.3.0";
  }
  const char *
  GetDescription() const override
  {
    return "A Factory";
  }
  
private:
  SubFilterFactory()
  {
    using InputImage = Image<float, 3>;

    // double -> float
    this->RegisterOverride(typeid(MyFilter<InputImage, double>).name(),
							typeid(SubFilter<InputImage, float>).name(),
							" ",
							true,
							CreateObjectFunction<SubFilter<InputImage, float>>::New());

    // float -> float
    this->RegisterOverride(typeid(MyFilter<InputImage, float>).name(),
							typeid(SubFilter<InputImage, float>).name(),
							" ",
							true,
							CreateObjectFunction<SubFilter<InputImage, float>>::New());

  }	    
}; // end SubFilterFactory

} // end itk

int
main(int argc, char * argv[])
{
  using InputImage = itk::Image<float, 3>;
  using MyFilterFloatType = itk::MyFilter<InputImage, float>;
  using MyFilterDoubleType = itk::MyFilter<InputImage, double>;

  // Instantiate MyFilter class as usual
  auto myFilterFloat = MyFilterFloatType::New();
  auto myFilterDouble = MyFilterDoubleType::New();

  std::cout << "Before factory registration:" << std::endl;
  std::cout << typeid(*myFilterFloat).name() << std::endl;
  std::cout << typeid(*myFilterDouble).name() << "\n" << std::endl;

  // Register factory
  itk::SubFilterFactory::RegisterOneFactory();
  
  // From now on, only SubFilters should be instantiated
  auto subFilterFloat = MyFilterFloatType::New();
  auto subFilterDouble = MyFilterDoubleType::New();

  std::cout << "After factory registration:" << std::endl;
  std::cout << typeid(*subFilterFloat).name() << std::endl;
  std::cout << typeid(*subFilterDouble).name() << std::endl;


} // end main()

The output is:

Before factory registration:
class itk::MyFilter<class itk::Image<float,3>,float>
class itk::MyFilter<class itk::Image<float,3>,double>

After factory registration:
class itk::SubFilter<class itk::Image<float,3>,float>
class itk::MyFilter<class itk::Image<float,3>,double>

As you can see the conversion from double->float fails, while the float->float succeeds. @N-Dekker, what are your thoughts? How should we move forward?

ntatsisk avatar Sep 29 '22 10:09 ntatsisk

I see, the dynamic_cast from SubFilter<itk::Image<float,3>,float>* to MyFilter<itk::Image<float,3>,double>>* fails, in ObjectFactory<itk::MyFilter<itk::Image<float,3>,double>>::Create(). At https://github.com/InsightSoftwareConsortium/ITK/blob/1e003467e263d31bfb02589ce2449a413dbe4fa2/Modules/Core/Common/include/itkObjectFactory.h#L63

N-Dekker avatar Sep 29 '22 11:09 N-Dekker

After some discussions with @N-Dekker, we agreed to have simply one test for GPUGenericMultiResolutionPyramid instead of two i.e. to keep the one which covers how the filter is instantiated in the actual elastix code. In addition, some duplicate code was simplified in that test. Lastly, based on the discussion regarding the factories above , I added a commit that removes all GPUResampler factories that were not functional (=they were attempting without success to override double with float) and, because of that, the GPUFactories test had to be updated as well.

If there are no further comments/questions/discussion points, I think the PR can be merged. :)

ntatsisk avatar Oct 04 '22 16:10 ntatsisk