What is the expected speedup when using OpenCL?
This is an excellent question that opens up plenty of passionate debate. When I rent hardware to run my own models, I prefer CPU-only machines without a GPU. The motivations are: AVX-capable CPUs are cheap, and I don't risk exceeding VRAM.
Before you consider me crazy, I recommend having a look at:
- https://deci.ai/blog/close-gap-cpu-performance-gpu-deep-learning-models/
- https://arxiv.org/abs/1903.03129
- https://minimaxir.com/2017/07/cpu-or-gpu/
To answer your question: small non-convolutional models might actually be slower on GPU. I would use a GPU only for bigger models with convolutions, where I would expect a speedup of 2x to 8x. My own models are trained in CPU-only environments because I have found a better price-performance ratio on CPU.
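The honest way to settle it for your model is to time both backends yourself. Here is a minimal timing sketch; the matmul workload and the sizes are placeholders standing in for your library's CPU and OpenCL inference calls:

```python
import time
import numpy as np

def benchmark(predict, batch, n_runs=100):
    """Average wall-clock time per call, with one warm-up run."""
    predict(batch)  # warm-up excludes one-time setup (e.g. OpenCL kernel compilation)
    start = time.perf_counter()
    for _ in range(n_runs):
        predict(batch)
    return (time.perf_counter() - start) / n_runs

# Stand-in workload: a dense matmul as a proxy for a model's forward pass.
# Replace predict_cpu with your library's CPU call, and add the OpenCL one.
weights = np.random.rand(512, 512).astype(np.float32)
batch = np.random.rand(64, 512).astype(np.float32)
predict_cpu = lambda x: x @ weights

cpu_time = benchmark(predict_cpu, batch)
print(f"CPU: {cpu_time * 1000:.3f} ms per batch")
```

Run the same harness against the GPU backend and divide the two averages; that ratio is the only speedup number that matters for your workload.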
Depending on where I rent hardware, I can get 20 CPU cores for the cost of one GPU. In any case, one model can be cost-effective on GPU while the next may not be; it's a moving target.
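To make that trade-off concrete, here is the back-of-the-envelope arithmetic I use. All numbers below are illustrative assumptions, not real prices:

```python
# Illustrative numbers only, not real quotes: the break-even point depends on
# how much GPU speedup the model actually gets versus how many CPU cores the
# same money buys, and on how well the workload scales across those cores.
gpu_cost_per_hour = 2.0         # hypothetical GPU instance price
cpu_cost_per_hour = 2.0         # hypothetical 20-core CPU instance, same price
cpu_cores = 20
gpu_speedup_vs_one_core = 6.0   # within the 2x-8x range mentioned above
cpu_scaling_efficiency = 0.7    # assumed fraction of linear scaling across cores

gpu_perf_per_dollar = gpu_speedup_vs_one_core / gpu_cost_per_hour
cpu_perf_per_dollar = (cpu_cores * cpu_scaling_efficiency) / cpu_cost_per_hour

print(f"GPU: {gpu_perf_per_dollar:.1f} core-equivalents per $/h")  # 3.0
print(f"CPU: {cpu_perf_per_dollar:.1f} core-equivalents per $/h")  # 7.0
# With these assumptions the CPU box wins; a model with a much larger GPU
# speedup (or a workload that doesn't parallelize across cores) flips it.
```

Plug in your own provider's prices and your measured speedup, and the answer can flip from one model to the next, which is exactly why it's a moving target.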