opencv_contrib
add interleaved versions of phase/cartToPolar/polarToCart
This PR is for performance only (at the cost of more template code and increased GPU code size). The additional variants let the caller skip the creation of temporary GPU mats (where memory is more likely to be a critical resource), and can even allow in-place processing. magnitude/angle and x/y values are often already interleaved when dealing with DFTs.
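To illustrate what "interleaved" means here, the following is a minimal CPU-side sketch, not the CUDA code in this PR. The function names `cartToPolarInterleaved` and `polarToCartInterleaved` are hypothetical, and the angle is left in the `atan2` range in radians for simplicity. Each element holds a pair stored side by side, so a single buffer can serve as both input and output:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper (not an OpenCV API): converts `count` interleaved
// (x, y) pairs into interleaved (magnitude, angle) pairs.
// Both values of a pair are read before either is written, so passing the
// same pointer for input and output (in-place processing) is safe.
void cartToPolarInterleaved(const float* xy, float* magAngle, std::size_t count)
{
    assert(xy != nullptr && magAngle != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        const float x = xy[2 * i];
        const float y = xy[2 * i + 1];
        magAngle[2 * i]     = std::hypot(x, y);  // magnitude
        magAngle[2 * i + 1] = std::atan2(y, x);  // angle in radians, (-pi, pi]
    }
}

// Hypothetical inverse: interleaved (magnitude, angle) -> interleaved (x, y).
void polarToCartInterleaved(const float* magAngle, float* xy, std::size_t count)
{
    assert(magAngle != nullptr && xy != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        const float m = magAngle[2 * i];
        const float a = magAngle[2 * i + 1];
        xy[2 * i]     = m * std::cos(a);
        xy[2 * i + 1] = m * std::sin(a);
    }
}
```

With planar inputs, the caller would instead need separate x, y, magnitude and angle buffers; the interleaved form is what allows a two-channel DFT result to be converted without those temporaries.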
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [X] I agree to contribute to the project under Apache 2 License.
- [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [X] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [X] There is accuracy test, performance test and test data in opencv_extra repository, if applicable. Patch to opencv_extra has the same branch name.
- [X] The feature is well documented and sample code can be built with the project CMake
@cudawarped could you take a look?
Of course, but I may not have time before the release of 4.9.0.
@chacha21 You will need to squash and rebase this onto the tip of the 4.x branch, as the CUDA CMake configuration has changed in the main repo since you submitted this PR, so I think it will fail on the CI.
Is this OK after "Merge branch '4.x'"? My brain has never accepted git terminology, so I am not sure which operation is the right one (with GitHub Desktop).
I rebased the patch onto the current 4.x and got a build error with CUDA 11.8 on Ubuntu 18.04:
[ 14%] Processing OpenCL kernels (core)
/home/alexander/Projects/OpenCV/opencv_contrib/modules/cudev/include/opencv2/cudev/functional/detail/../functional.hpp(625): error: a class or namespace qualified name is required
/home/alexander/Projects/OpenCV/opencv_contrib/modules/cudev/include/opencv2/cudev/functional/detail/../functional.hpp(643): error: a class or namespace qualified name is required
/home/alexander/Projects/OpenCV/opencv_contrib/modules/cudev/include/opencv2/cudev/functional/detail/../functional.hpp(625): error: a class or namespace qualified name is required
/home/alexander/Projects/OpenCV/opencv_contrib/modules/cudev/include/opencv2/cudev/functional/detail/../functional.hpp(643): error: a class or namespace qualified name is required
2 errors detected in the compilation of "/home/alexander/Projects/OpenCV/opencv-master/modules/core/src/cuda/gpu_mat_nd.cu".
CMake Error at cuda_compile_1_generated_gpu_mat_nd.cu.o.Release.cmake:280 (message):
Error generating file
/home/alexander/Projects/OpenCV/opencv_contrib_build/modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/./cuda_compile_1_generated_gpu_mat_nd.cu.o
modules/core/CMakeFiles/opencv_core.dir/build.make:82: recipe for target 'modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/cuda_compile_1_generated_gpu_mat_nd.cu.o' failed
make[3]: *** [modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/cuda_compile_1_generated_gpu_mat_nd.cu.o] Error 1
make[3]: *** Waiting for unfinished jobs....
2 errors detected in the compilation of "/home/alexander/Projects/OpenCV/opencv-master/modules/core/src/cuda/gpu_mat.cu".
CMake Error at cuda_compile_1_generated_gpu_mat.cu.o.Release.cmake:280 (message):
Error generating file
/home/alexander/Projects/OpenCV/opencv_contrib_build/modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/./cuda_compile_1_generated_gpu_mat.cu.o
modules/core/CMakeFiles/opencv_core.dir/build.make:75: recipe for target 'modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/cuda_compile_1_generated_gpu_mat.cu.o' failed
I don't have such a problem with CUDA 12.4 under Visual Studio 2022. Could it be related to b330b6c5?
Another issue with CUDA 12.5 on Ubuntu 20.04:
[ 40%] Building NVCC (Device) object modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/cuda_compile_1_generated_gpu_mat.cu.o
[ 40%] Building NVCC (Device) object modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/cuda_compile_1_generated_gpu_mat_nd.cu.o
/home/ksenia/Projects/opencv_contrib/modules/cudev/include/opencv2/cudev/functional/detail/../functional.hpp(625): error: a class or namespace qualified name is required
__attribute__((device)) __inline__ __attribute__((always_inline)) typename T operator ()(typename TypeTraits<T2>::parameter_type ab) const
^
/home/ksenia/Projects/opencv_contrib/modules/cudev/include/opencv2/cudev/functional/detail/../functional.hpp(625): error: a class or namespace qualified name is required
__attribute__((device)) __inline__ __attribute__((always_inline)) typename T operator ()(typename TypeTraits<T2>::parameter_type ab) const
^
/home/ksenia/Projects/opencv_contrib/modules/cudev/include/opencv2/cudev/functional/detail/../functional.hpp(643): error: a class or namespace qualified name is required
__attribute__((device)) __inline__ __attribute__((always_inline)) typename T operator ()(typename TypeTraits<T2>::parameter_type ab) const
^
/home/ksenia/Projects/opencv_contrib/modules/cudev/include/opencv2/cudev/functional/detail/../functional.hpp(643): error: a class or namespace qualified name is required
__attribute__((device)) __inline__ __attribute__((always_inline)) typename T operator ()(typename TypeTraits<T2>::parameter_type ab) const
^
2 errors detected in the compilation of "/home/ksenia/Projects/opencv/modules/core/src/cuda/gpu_mat_nd.cu".
2 errors detected in the compilation of "/home/ksenia/Projects/opencv/modules/core/src/cuda/gpu_mat.cu".
CMake Error at cuda_compile_1_generated_gpu_mat_nd.cu.o.Release.cmake:280 (message):
Error generating file
/home/ksenia/Projects/opencv_contrib-build/modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/./cuda_compile_1_generated_gpu_mat_nd.cu.o
I rebased the local branch to 4.x to include all patches for CUDA 12.x.
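For reference, the diagnostic in the log above is reported on a declaration that begins with `typename T operator ()`. In standard C++, `typename` used this way must be followed by a qualified name (such as `TypeTraits<T2>::parameter_type`), so `typename T` on a bare template parameter is rejected by conforming front ends. Some compilers are more lenient, which might explain why one toolchain accepts it and another does not, though that is an assumption rather than something verified against this PR. A minimal sketch of the pattern and one possible correction, with hypothetical names (`ParamTraitsSketch`, `Float2Sketch`, `SumSketch`) that are not the cudev types:

```cpp
// Hypothetical stand-in for a traits class; not the cudev TypeTraits.
template <typename T>
struct ParamTraitsSketch {
    using parameter_type = const T&;
};

// Hypothetical interleaved element holding two values side by side.
struct Float2Sketch {
    float x;
    float y;
};

template <typename T, typename T2>
struct SumSketch {
    // Ill-formed: `typename` must introduce a qualified name, and `T` is not one.
    // typename T operator()(typename ParamTraitsSketch<T2>::parameter_type ab) const;

    // Well-formed: the return type is plain `T`; `typename` is kept only on
    // the dependent qualified name in the parameter list.
    T operator()(typename ParamTraitsSketch<T2>::parameter_type ab) const
    {
        return static_cast<T>(ab.x) + static_cast<T>(ab.y);
    }
};
```

If this is indeed the cause, dropping `typename` from the return type at the two reported lines would be enough, independently of any tuple-related name clash.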
I think it must be related to some "tuple" name clashing in my calls to gridTransformTuple() and the make_tuple() or tie() helper.
I wish I could get rid of the complex "gridTransformTuple()" abstraction.
I will investigate.
The "PR:4.x / Ubuntu2004-ARM64 / BuildAndTest" build failure does not seem to be related to the current pull request.