Implemented fast median filter for CUDA using Wavelet Matrix, a constant-time, HDR-compatible method

Open TumoiYorozu opened this issue 5 months ago • 8 comments

I replaced the existing CUDA implementation of the histogram-based median filter with an implementation of a new wavelet matrix-based median filter algorithm, which I presented at SIGGRAPH Asia 2022. This paper won the Best Paper Award in the journal track of technical papers (ACM Transactions on Graphics).

Like the histogram method, the new algorithm runs in time that does not depend on the window radius, and it is several times faster than the histogram method. Furthermore, the histogram method supports only 8U images and so cannot handle HDR data, whereas the new algorithm supports HDR and also handles 16U and 32F images.
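For context on what "independent of the window radius" is being compared against, here is a minimal CPU sketch of a naive median filter whose per-pixel cost grows with the window area. It is a baseline for illustration only, not the wavelet matrix algorithm and not the OpenCV code; the function name `naiveMedian`, the flat row-major layout, and the replicated-border handling are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Naive reference (hypothetical helper): median of a (2r+1) x (2r+1) window
// over a single-channel, row-major image. Border pixels are replicated,
// which is an assumption of this sketch.
// Per-pixel cost is O(r^2); constant-time methods avoid this growth.
template <typename T>
std::vector<T> naiveMedian(const std::vector<T>& src, int width, int height, int radius) {
    std::vector<T> dst(src.size());
    std::vector<T> window;
    window.reserve(static_cast<std::size_t>(2 * radius + 1) * (2 * radius + 1));
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            window.clear();
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    // Clamp coordinates so edge pixels are repeated.
                    const int yy = std::min(std::max(y + dy, 0), height - 1);
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    window.push_back(src[static_cast<std::size_t>(yy) * width + xx]);
                }
            }
            // Partial selection is enough; a full sort is not needed.
            auto mid = window.begin() + window.size() / 2;
            std::nth_element(window.begin(), mid, window.end());
            dst[static_cast<std::size_t>(y) * width + x] = *mid;
        }
    }
    return dst;
}
```

Because the template works for `std::uint8_t`, `std::uint16_t`, and `float` alike, a helper of this kind can serve as a slow ground truth when checking 16U or 32F results at window sizes that a faster implementation claims to support.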

As the author, I have published the implementation on my personal GitHub and adapted it so that it can be called from OpenCV. The implementation uses the CUB library, which has been part of the standard CUDA Toolkit since CUDA 11.0. The code therefore selects the algorithm based on CUDA_VERSION: the new algorithm for CUDA 11.0 and above, and the existing histogram method for CUDA 10 and below.
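The version-based selection described above could look roughly like the following. This is a sketch under stated assumptions, not the guard used in the PR: `MedianBackend`, `selectBackend`, and `kBackend` are hypothetical names, and the fallback definition of CUDA_VERSION exists only so the snippet builds without the CUDA headers. The one fact relied on is that CUDA_VERSION, as defined in `cuda.h`, encodes the version as major * 1000 + minor * 10, so CUDA 11.0 is 11000.

```cpp
// CUDA_VERSION is normally provided by <cuda.h> (11000 means CUDA 11.0).
// The fallback to 0 is an assumption so this sketch also builds without CUDA.
#ifndef CUDA_VERSION
#define CUDA_VERSION 0
#endif

// Hypothetical names, for illustration only.
enum class MedianBackend { Histogram, WaveletMatrix };

// CUB ships with the toolkit from CUDA 11.0 on, so that is the cut-over point.
constexpr MedianBackend selectBackend(int cudaVersion) {
    return cudaVersion >= 11000 ? MedianBackend::WaveletMatrix
                                : MedianBackend::Histogram;
}

// Resolved at compile time, mirroring an `#if CUDA_VERSION >= 11000` guard.
constexpr MedianBackend kBackend = selectBackend(CUDA_VERSION);

static_assert(selectBackend(11000) == MedianBackend::WaveletMatrix,
              "CUDA 11.0 takes the wavelet matrix path");
static_assert(selectBackend(10020) == MedianBackend::Histogram,
              "CUDA 10.2 falls back to the histogram path");
```

In real code the two paths would usually sit behind a preprocessor guard rather than a runtime enum, since the CUB headers cannot be included on older toolkits; the `constexpr` function is used here only to make the threshold easy to check.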

Regarding the old histogram-based code: the CPU version of the median filter supports 16U and 32F for window sizes up to 5, but the histogram-based CUDA version does not appear to support them. The supported channel counts also differ: the CPU version supports 1, 3, and 4 channels, while the CUDA version supports only 1 channel. In addition, the CUDA histogram method set pixels near the image edges, where the window does not fit inside the image, to zero. For example, with a window size of 7, a 3-pixel-wide band along the top, bottom, left, and right edges was not computed correctly. The existing tests compared against the CPU version only after cropping the edges with a rect, and the cropped area was wider than necessary: 8 pixels were cropped from the top, bottom, left, and right for a window size of 7.
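The border arithmetic in that example can be written down directly. The sketch below computes the region where a full window fits inside the image; `Roi`, `borderFor`, and `validRoi` are hypothetical names introduced for illustration, and the actual rect used in the tests is not shown here.

```cpp
// Hypothetical rectangle type for this sketch (x, y is the top-left corner).
struct Roi {
    int x;
    int y;
    int width;
    int height;
};

// Clamp helper so degenerate images do not yield negative sizes.
constexpr int nonNegative(int v) { return v < 0 ? 0 : v; }

// Number of edge pixels on each side where a k x k window does not fit.
constexpr int borderFor(int kernelSize) { return kernelSize / 2; }

// Largest region in which every pixel has a complete window around it.
constexpr Roi validRoi(int imageWidth, int imageHeight, int kernelSize) {
    return Roi{borderFor(kernelSize),
               borderFor(kernelSize),
               nonNegative(imageWidth - 2 * borderFor(kernelSize)),
               nonNegative(imageHeight - 2 * borderFor(kernelSize))};
}

// Window size 7 needs a 3-pixel border, not 8.
static_assert(borderFor(7) == 3, "radius of a 7x7 window is 3");
```

For a 640 x 480 image and window size 7, this gives a 634 x 474 region starting at (3, 3). Cropping 8 pixels per side instead would leave 624 x 464, so 5 rows and columns of correctly computed pixels on every side would go unchecked.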

In this PR, I first corrected the rect range in the tests so that both the old histogram method and the new wavelet matrix method pass. The CUDA version now supports 16U and 32F images as well as 3- and 4-channel images, and I have added new tests covering these formats. In addition, while the CPU version supports HDR only for window sizes up to 5, the new CUDA Wavelet Matrix method supports sizes of 7 and above.

Paper’s project page: Constant Time Median Filter using 2D Wavelet Matrix | Interactive Graphics & Engineering Lab

My implementation (as author): GitHub - TumoiYorozu/WMatrixMedian

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • [x] I agree to contribute to the project under Apache 2 License.
  • [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • [x] The PR is proposed to the proper branch
  • ~~[ ] There is a reference to the original bug report and related work~~
  • [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable. Patch to opencv_extra has the same branch name.
  • [x] The feature is well documented and sample code can be built with the project CMake

TumoiYorozu • Jan 22 '24 23:01