
Is it possible to use AMD HIP to convert CUDA for "dlib" dnn face recognition to use AMD GPU?

Open rajhlinux opened this issue 3 years ago • 1 comment


Hello, I'm amazed at how easy and fast it was to install "dlib" and get the face recognition example running straight from the C++ source via GitHub on FreeBSD 13.1. Great job keeping things simple, organized, and in working order.

With that said, the dnn face recognition example provided by "dlib" is extremely useful and accurate. I'm building a surveillance PC DVR system running 8 analog CCTV cameras, and I'm already able to get H.265 (HEVC) transcoding working on AMD GPUs thanks to VA-API and FFmpeg.

I know that in the machine vision world Nvidia GPUs are used almost religiously. But I spoke with the OpenCL group on Freenode (now Libera) and they told a totally different story: AMD performance in machine vision should be on par with Nvidia, and "no one" really knows what they're doing when it comes to implementing machine vision GPU acceleration on AMD GPUs. They recommended using AMD's HIP to convert the CUDA code, and said the performance should be "similar" (I'm not sure how big a percentage difference "similar" covers).

So, are there any recommendations or advice on exactly how I can take dlib's CUDA face recognition implementation and convert it with AMD HIP so that it can run on AMD GPUs?
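For reference, the mechanical conversion path the OpenCL folks described would presumably go through ROCm's hipify tool. This is only a sketch of what that first pass might look like (the file path is a made-up example, and I haven't verified this against dlib's actual sources):

```shell
# Hypothetical first pass: rewrite CUDA runtime/API calls as HIP calls.
# hipify-perl ships with ROCm; the input path here is an assumption.
hipify-perl some_cuda_kernel.cu > some_cuda_kernel.hip.cpp

# Note: calls into closed-source Nvidia libraries (cuDNN, cuBLAS, ...)
# have no automatic translation; at best they would need to be ported
# by hand to AMD's counterparts (MIOpen, hipBLAS, ...).
```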

My PC Build: GPU - AMD Radeon RX 580 (Supported by ROCm and HIP) CPU - AMD FX-8350 OS - FreeBSD 13.1 Programming Language: C++11 Compiler: Clang 13 and GCC 11

Also, one more question: how can I get a USB webcam stream working with the dnn face recognition example in real time?

Thanks.

rajhlinux avatar Sep 17 '22 01:09 rajhlinux

Anyone telling you it's easy to run cuda code on other platforms is trying to sell you something and not being honest.

There is a ton of stuff in the cuda SDK as well as many important closed source libraries like cudnn, cublas and the like. It's a ton of software that somehow would have to be replicated exactly. To say nothing of nvcc (the cuda compiler) which somehow needs to be replaced with some other compiler that can compile cuda code.

When someone is saying this stuff they are invariably saying "well it is theoretically possible to rewrite the software from scratch to accomplish the same result". Which is always true.

The reason nvidia is popular for this stuff is because they made a really nice SDK for doing GPU programming. There are all these useful libraries, nvcc is great, it integrates with c++ super well, and the whole dev experience is just great. So people build on all that.

I would not characterize OpenCL the same way.

I would use OpenCV for reading from a webcam, like in the dlib example that does that.
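A minimal sketch of that approach (assuming OpenCV and dlib are installed; the capture-and-wrap pattern follows dlib's webcam_face_pose_ex.cpp example, and the face rectangles it produces are what you would feed into the dnn face recognition pipeline):

```cpp
#include <vector>
#include <opencv2/videoio.hpp>
#include <dlib/opencv.h>
#include <dlib/gui_widgets.h>
#include <dlib/image_processing/frontal_face_detector.h>

int main() {
    cv::VideoCapture cap(0);  // open the default USB webcam
    if (!cap.isOpened()) return 1;

    dlib::frontal_face_detector detector = dlib::get_frontal_face_detector();
    dlib::image_window win;

    cv::Mat frame;
    while (!win.is_closed() && cap.read(frame)) {
        // Wrap the OpenCV BGR frame as a dlib image without copying.
        dlib::cv_image<dlib::bgr_pixel> img(frame);

        // Detect faces in the frame; these rectangles are what the
        // dnn face recognition example crops and embeds.
        std::vector<dlib::rectangle> faces = detector(img);

        win.clear_overlay();
        win.set_image(img);
        for (const auto& r : faces) win.add_overlay(r);
    }
    return 0;
}
```

Note that `frame` must stay alive (and unmodified) while `img` wraps it, since `dlib::cv_image` is a non-owning view.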

davisking avatar Sep 17 '22 13:09 davisking

Thank you for your reply.

I just wasted a good week trying to figure out how to get CUDA code working on AMD GPUs, and then realized yesterday that I was completely wasting my time. Nvidia GPUs specifically have "Tensor Cores", which are the same kind of hardware that other SoC companies market as "APU/NPU/AI cores/ML cores": dedicated matrix-multiplication units baked into the chip's silicon precisely to accelerate the machine vision / deep learning / machine learning computation we all want. It's really sad that to this day AMD has not put any comparable tensor-core silicon into their products. I guess I should just buy an Nvidia card, or a lower-cost SoC (e.g. from MediaTek), to get the right tool for the job with these important tensor-core units embedded in the chip.

I was super frustrated about why the entire AI/machine learning/deep learning/machine vision community uses Nvidia GPUs; it all makes sense now, simply because of the hardware (software too, but software is nothing without the proper hardware underneath it).

It seems few people (myself included, until now) know this important fact, and simply buy an AMD GPU thinking OpenCL will be their savior on regular GPU cores, when tensor cores outperform general-purpose GPU cores for AI tasks.

rajhlinux avatar Sep 19 '22 01:09 rajhlinux

Well, I'm not saying AMD doesn't have capable hardware. It's just about the software ecosystem around it.

davisking avatar Sep 19 '22 12:09 davisking