
CUDNN Runtime detection

Open xsacha opened this issue 7 years ago • 9 comments

Would it be possible to detect CUDNN at runtime, so that if the user does not have an NVIDIA video card, dlib falls back to CPU-based BLAS libraries (like Intel MKL)?

xsacha avatar Sep 28 '17 07:09 xsacha

It's possible to write code like that, but that isn't how the dlib code is set up now. If someone wants to make a PR that does this I'm fine with it, so long as they figure out how to make it not confusing to users, which is the central challenge. A very large number of dlib users are just beginning to program and have no understanding of what linking is, as attested by the hundreds of questions posted about dlib. So somehow doing this in a way that doesn't increase the number of confused questions is the central challenge.

davisking avatar Sep 28 '17 09:09 davisking

So how about always falling back to the internal CBLAS if CUDNN doesn't exist? As a stepping stone, we know this adds no extra linkage (it is already compiled in), and it is a sane fallback because otherwise the application just crashes.

xsacha avatar Sep 28 '17 23:09 xsacha

Yes, that's the general idea. But there are a lot of little details. How does the application "find out if cudnn exists" in a way that doesn't lead to user confusion? How does a user tell the application where to find CUDA and deal with the runtime linking options?

davisking avatar Sep 29 '17 00:09 davisking

Rather, you would check whether a CUDA-capable device exists. NVIDIA provides functions for this in the driver library.

You would call cuInit, then check that cuDeviceGetCount returns a nonzero count, and then query cuDeviceComputeCapability.

xsacha avatar Sep 29 '17 00:09 xsacha

Yes, you could do that. It wouldn't be very difficult to do since all the calls are hidden inside the code in dlib::tt, which switches to either the CPU or GPU from there. Making that runtime switchable is probably a good idea and something someone should submit in a pull request.

However, my main point is about dealing with systems that don't have cuda and/or cudnn installed. That's the main pain point people complain about. They compile the application with cuda code in it. Then they try to run it on another computer that doesn't have cuda installed and they get a runtime linking error. My impression, based on user feedback, is that that is far and away the central problem.

Anyway, if you want to submit a PR that adds a runtime switch to dlib::tt that shunts the codepath back to the CPU based on the state of the switch that would be cool :)

davisking avatar Sep 29 '17 01:09 davisking

Not having CUDA/cuDNN installed is a separate issue (not related to this). On my side I just distribute the CUDA DLLs with my app (note: for cuDNN we needed permission from NVIDIA), so users don't actually need it installed.

xsacha avatar Sep 29 '17 02:09 xsacha

Right, I know. I'm just thinking about the sorts of questions I'm likely to get. I've become kind of crotchety as a result of so many ignorant questions :(

Anyway, what you are proposing sounds good. You should submit a PR :)

davisking avatar Sep 29 '17 02:09 davisking

It would be nice if there were a base implementation class that you could implement a CPUImpl or GPUImpl on top of. Right now everything in dlib::tt is just in #ifdefs, which is a bit nasty to work with.

xsacha avatar Nov 06 '17 06:11 xsacha

You don't need a class. You could just make a global function that returns a bool, like bool use_cuda(), and put it in the dlib::tt namespace. Then make all the routines in tt switch based on that. Anyway, you should submit a PR :)

davisking avatar Nov 06 '17 10:11 davisking