opencv_contrib
opencv_contrib copied to clipboard
Try to Fix #3725: cudaarithm: fix the compile faiure of CUDA 12.
A slight API change of nppiMeanStdDevGetBufferHostSize_8u_C1R and nppiMeanStdDevGetBufferHostSize_32f_C1R in NPP of CUDA 12 has caused the #3725. I will try to fix this. I found that the type of bufSize is size_t instead of int in the reductions.cpp in the NPP header file, where the nppi_statistics_functions.h changed the type of second parameter from * int to * size_t.
nppi_statistics_functions.h 5392:5408
NppStatus
nppiMeanStdDevGetBufferHostSize_32f_C1R_Ctx(NppiSize oSizeROI, size_t * hpBufferSize/* host pointer */, NppStreamContext nppStreamCtx);
/**
* Buffer size for \ref nppiMean_StdDev_32f_C1R.
*
* For common parameter descriptions, see \ref CommonMeanStdDevGetBufferHostSizeParameters.
*
*/
NppStatus
nppiMeanStdDevGetBufferHostSize_32f_C1R(NppiSize oSizeROI, size_t * hpBufferSize/* host pointer */);
/**
* Buffer size for \ref nppiMean_StdDev_8u_C1MR.
*
* For common parameter descriptions, see \ref CommonMeanStdDevGetBufferHostSizeParameters.
*
*/
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
/cc @cudawarped
I checked the denfition of CUDA_VERSION in CUDA 12 which is the 12.XX.XX instad of a int number
@sdy623
I think you are confusing CMake generation and compilation. Adding that definition which is still for the incorrect verison of CUDA into the CMake file is unecessary.
Version Info
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda.h
> /**
> * CUDA API version number
> */
> #define CUDA_VERSION 12040
Function definitions
- C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\nppi_statistics_functions.h
NppStatus
nppiMeanStdDevGetBufferHostSize_8u_C1R_Ctx(NppiSize oSizeROI, int * hpBufferSize/* host pointer */, NppStreamContext nppStreamCtx);
- C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include\nppi_statistics_functions.h
NppStatus
nppiMeanStdDevGetBufferHostSize_8u_C1R_Ctx(NppiSize oSizeROI, size_t * hpBufferSize/* host pointer */, NppStreamContext nppStreamCtx);
<int &, int &>, int, cuda::std::__4::tuple<cv::cudev::minimum
, cv::cudev::maximum >) blockReduce<BLOCK_SIZE>(smem_tuple(sminval, smaxval),
I checked the denfition of CUDA_VERSION in CUDA 12 which is the 12.XX.XX instad of a int number
@sdy623
I think you are confusing CMake generation and compilation. Adding that definition which is still for the incorrect verison of CUDA into the CMake file is unecessary.
Version Info
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda.h
> /** > * CUDA API version number > */ > #define CUDA_VERSION 12040Function definitions
- C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\nppi_statistics_functions.h
NppStatus nppiMeanStdDevGetBufferHostSize_8u_C1R_Ctx(NppiSize oSizeROI, int * hpBufferSize/* host pointer */, NppStreamContext nppStreamCtx);
- C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include\nppi_statistics_functions.h
NppStatus nppiMeanStdDevGetBufferHostSize_8u_C1R_Ctx(NppiSize oSizeROI, size_t * hpBufferSize/* host pointer */, NppStreamContext nppStreamCtx);
I deleted the edit of the CMake and find the correct CUDA_VERSION is 12040 in CUDA 12.4. I found a way to check the cuda.h without install it, just use 7-Zip to open the cuda_12.4.1_551.78_windows.exe and find this directory cuda_12.4.1_551.78_windows.exe\cuda_cudart\cudart\include\cuda.h and found the macro denfination
/**
* CUDA API version number
*/
#define CUDA_VERSION 12040
I found a way to check the cuda.h without install it, just use 7-Zip to open the cuda_12.4.1_551.78_windows.exe and find this directory cuda_12.4.1_551.78_windows.exe\cuda_cudart\cudart\include\cuda.h and found the macro denfination
Whilst this is a completely valid way to check the header I would advise you to install CUDA 12.4 when submitting a PR which fixes something that it breaks.
If you do that you will realize that
-
Your conversions from
size_ttointare throwing warningswarning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data
which you should address when using the new buffer size type as I mentioned above. e.g.
CV_Assert(bufSize <= std::numeric_limits<int>::max()); GpuMat buf = pool.getBuffer(1, static_cast<int>(bufSize), gsrc.type()); -
Because of the existing bug I mentioned before https://github.com/opencv/opencv_contrib/issues/3690 you will be unable to build the
cudaarithmmodule even with this fix. i.e.cmake --build . --target opencv_cudaarithm
still fails.
If I was authoring this PR I would install both CUDA 12.3 and 12.4 and check that this builds on both without errors.
@opencv-alalek this resolves the issue mentioned but still results in build errors because cudaarithm uses cudev ( https://github.com/opencv/opencv_contrib/issues/3690). I suggest this is revisited when cudev can be built against CUDA 12.4.
I still have an issue https://github.com/opencv/opencv_contrib/issues/3728
I still have an issue #3728
@jiapei100 Your issue is related to cudev (https://github.com/opencv/opencv_contrib/issues/3690) not cudaarithm.
great job! thank you