raisr

How can we make it GPU-optimized?

Open ELHoussineT opened this issue 7 years ago • 5 comments

If a GPU implementation doesn't already exist and someone is interested, we could work on one together.

Thank you

ELHoussineT avatar Sep 22 '17 15:09 ELHoussineT

I don't have the programming experience necessary, but I'd love to see that exist.

gaara100 avatar Feb 07 '18 04:02 gaara100

Yeah, those nested for loops just scream "parallelize me!" Looks like there are Python hooks for NVIDIA CUDA. What are you looking to thread? The image processing should be a breeze: either write our own, since we'll have to bust the Pillow API, or find existing threaded variants for a lot of these convolutions/linear math.
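To illustrate why per-pixel loops like these parallelize well, here is a minimal sketch in NumPy. It is not the repository's code: `filter_loop`, `filter_vectorized`, the random image, and the 5x5 kernel are all made up for illustration. The point is only that each output pixel depends on its own patch, so the loop body can run for every pixel at once.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def filter_loop(img, kernel):
    """Reference version: one patch at a time, as in a nested for loop."""
    k = kernel.shape[0]
    h, w = img.shape[0] - k + 1, img.shape[1] - k + 1
    out = np.empty((h, w), dtype=np.float64)
    for y in range(h):
        for x in range(w):
            # Each output pixel depends only on its own k x k patch.
            out[y, x] = np.sum(img[y:y + k, x:x + k] * kernel)
    return out


def filter_vectorized(img, kernel):
    """Same computation with every patch handled in one batched operation."""
    k = kernel.shape[0]
    patches = sliding_window_view(img, (k, k))  # shape (h, w, k, k)
    return np.einsum("hwij,ij->hw", patches, kernel)


# Hypothetical test data, not taken from the project.
rng = np.random.default_rng(0)
img = rng.random((64, 64))
kernel = rng.random((5, 5))

loop_result = filter_loop(img, kernel)
vector_result = filter_vectorized(img, kernel)
```

Both functions return the same array. The batched form is the shape of computation that a CUDA kernel would run, with one thread per output pixel instead of one loop iteration.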

jdtran avatar Feb 08 '18 03:02 jdtran

OpenCV automatically uses the GPU through OpenCL for image processing when it is available (it can also use CPU OpenCL runtimes to improve performance). Parts of the code already use OpenCV; just rewriting the image preprocessing to use OpenCV would most likely bring a huge performance boost. Additionally, parallelizing some of the for loops would probably help a bit (during "Processing image … of … (train/…)" it currently uses only one CPU core).

BlauerHunger avatar Jul 05 '18 20:07 BlauerHunger

Did anyone manage to implement this in the end? I noticed that OpenCV is a requirement and I'm not sure whether you were discussing an older version above.

StephenJPereira avatar Jan 28 '19 16:01 StephenJPereira

I wrote an implementation of the inference half in an OpenGL compute shader. Performance is okay but could probably be improved: roughly 8 ms to upscale from 1080p to 4K on a GTX 1070. I stored the filterbank in a texture, but it's plausible that you could get better performance by storing it in a UBO and tiling in shared memory.
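For readers who haven't seen the inference step, here is a small NumPy sketch of the memory access pattern that makes the filterbank storage choice matter. This is not the shader code: `PATCH`, `NUM_BUCKETS`, the random filterbank, and the random bucket map are all assumptions for illustration, and a real bucket map would come from hashing each patch's gradients.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

PATCH = 11          # assumed patch size
NUM_BUCKETS = 216   # hypothetical number of hash buckets

rng = np.random.default_rng(0)
# One flattened filter per bucket (random placeholder values).
filterbank = rng.random((NUM_BUCKETS, PATCH * PATCH))
img = rng.random((48, 48))

# Extract every PATCH x PATCH neighbourhood and flatten it.
patches = sliding_window_view(img, (PATCH, PATCH))
h, w = patches.shape[:2]
patches = patches.reshape(h, w, PATCH * PATCH)

# Placeholder bucket map: one filter index per output pixel.
buckets = rng.integers(0, NUM_BUCKETS, size=(h, w))

# Gather one filter per pixel, then take the per-pixel dot product.
per_pixel_filters = filterbank[buckets]  # shape (h, w, PATCH * PATCH)
out = np.einsum("hwk,hwk->hw", patches, per_pixel_filters)
```

Every pixel reads a different row of `filterbank`, so neighbouring pixels can hit scattered memory. That gather is the part a texture, a UBO, or a shared-memory tile would serve differently.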

AnimatedRNG avatar Aug 14 '19 01:08 AnimatedRNG