stereo-sgm-opencl
program hangs at clBuildProgram
The program hangs while building the "m_down2up" program. There are no error messages whatsoever; the program simply hangs on this line. The previous programs build successfully.
Hi, could you please share more details, such as which hardware, OS, compiler, and device driver you are using?
I tried with the latest NVIDIA driver and issue #6 came up, which may be connected to your issue as well. Could you please try the latest master and see whether it fixes the problem for you?
On Windows:
- Compiler: MSVC 19.34.31937.0
- Hardware: NVIDIA T1000, Driver: NVIDIA 30.0.14.7239
On Ubuntu 20.04:
- Compiler: GNU 9.4.0
- Hardware: NVIDIA TITAN RTX, Driver Version: 515.65.01
Unfortunately, trying the latest master did not fix my problem on either OS. Thank you very much for your help!
I have tried on both Windows and Linux, but unfortunately I couldn't reproduce the issue. Are you using the stereo move example? I updated it to better handle the case where the input cannot be opened or the input image has zero size. Could you pull the latest master, do a clean build, and test it again? Also, is there any message on the command line?
Hi, yes, I did try it with the new master, but it didn't work. And since the code hangs on one line, there is no error message. A friend of mine could make it work, but we don't know yet what is wrong in my case. Since I tracked down the failing line of code in debug mode, I also know that loading images is not the problem.
Is it possible that your computer has multiple platforms available, for example Intel, NVIDIA, AMD, etc.? You can select the platform by specifying --platform_idx=X, where X is the index of the platform on your computer; the default value is 0.
Also, when the application starts, it should print the platform and device name it is using. In my case, for example:
Platform name: NVIDIA CUDA
Device name: NVIDIA GeForce GTX 1070
Yes, this works correctly. It selects NVIDIA CUDA as well and then the NVIDIA T1000 graphics card. Afterwards the program hangs.
Is it possible that I need a specific version of CUDA and a specific driver? Also, what build toolchain / compiler do you use?
I have managed to reproduce the issue: at the office we have an RTX 3090, and the same issue is present. I tried to debug it by decoupling the compile and link steps of the programs, and it hangs in the link phase. I don't know yet what causes the issue, but I think it could be related to GPU architecture, because older Pascal GPUs (1080, 1070) work fine, but Turing and later do not.
Actually, I came to a similar conclusion. The only hardware on which I was able to run the code without clBuildProgram hanging was my very old GTX 760. My colleague uses a GTX 1050, and it also works fine.
For the other hardware, I found that disabling optimization for the program build resolves the hang, but naturally leads to slower execution. More precisely, I pass the option "-cl-opt-disable" to clBuildProgram:
const char* options = {"-cl-opt-disable"};
err = clBuildProgram(m_cl_program, 1, &device, options, nullptr, nullptr);
in file device_kernel.h.
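Since disabling optimization costs execution speed, the workaround could be made opt-in rather than hard-coded. The following is only a minimal sketch under stated assumptions: the helper make_build_options and the environment variable SGM_CL_OPT_DISABLE are hypothetical and not part of the repository; only the "-cl-opt-disable" option and the clBuildProgram call come from the post above.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper (not in the repository): returns the option string
// to pass to clBuildProgram.
// `env_value` is the value of a hypothetical environment variable such as
// SGM_CL_OPT_DISABLE; nullptr, "" or "0" keep compiler optimizations enabled.
inline std::string make_build_options(const char* env_value) {
    const bool disable_opt = env_value != nullptr
                          && env_value[0] != '\0'
                          && std::strcmp(env_value, "0") != 0;
    return disable_opt ? "-cl-opt-disable" : "";
}

// Possible use at the call site shown above (assumes <cstdlib> for getenv):
//   const std::string options =
//       make_build_options(std::getenv("SGM_CL_OPT_DISABLE"));
//   err = clBuildProgram(m_cl_program, 1, &device, options.c_str(),
//                        nullptr, nullptr);
```

With this arrangement the optimized build stays the default, and only machines affected by the hang need to set the variable.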
Regarding your suggestion about architecture: I could run the program on NVIDIA hardware with the Kepler, Pascal, and Ampere architectures, but not on the Turing architecture. That might be the problem after all.
It is a very weird issue. It seems that some kernels segfault, and that causes the hang. If the input size is rounded to a multiple of 4, then it works, at least for me, with 4-path aggregation; 8-path aggregation still doesn't work, so I need to debug it more. I updated master with the rounding of the input image size in 8cf8b251dbe2d4ebdc37faa20e7457205d03af63, could you try it?
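For readers who want to apply the same idea to their own inputs, rounding a dimension to a multiple of 4 can be sketched as below. This is an illustration only: the function names are hypothetical, and the thread does not say whether the commit rounds down (cropping) or up (padding), so both variants are shown.

```cpp
// Hypothetical helpers, not taken from the repository.
// Both assume value >= 0 and multiple > 0.

// Largest multiple of `multiple` that is <= value (corresponds to cropping).
constexpr int round_down_to_multiple(int value, int multiple) {
    return value - value % multiple;
}

// Smallest multiple of `multiple` that is >= value (corresponds to padding).
constexpr int round_up_to_multiple(int value, int multiple) {
    return (value + multiple - 1) / multiple * multiple;
}

// Compile-time checks with example image sizes.
static_assert(round_down_to_multiple(641, 4) == 640, "crop 641 to 640");
static_assert(round_up_to_multiple(641, 4) == 644, "pad 641 to 644");
static_assert(round_down_to_multiple(479, 4) == 476, "crop 479 to 476");
static_assert(round_up_to_multiple(479, 4) == 480, "pad 479 to 480");
```

Rounding down loses up to 3 pixels per dimension, while rounding up requires padding the image buffer before it is passed to the kernels.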
Yes it works, very nice! Thanks a lot for the quick fix.