
Multi-GPU Support in Python

Open JBlaschke opened this issue 5 years ago • 7 comments

Hi all y'all

do we have multi-GPU support for the Python front-end? I.e., is there a way to inject cudaSetDevice (or gpuDeviceInit) into calls to the Python front end?

I am not looking for fancy multi-GPU support (i.e. I don't need to share data between GPUs -- yet), just a way to have one plan run on one GPU and another plan run on another.

Please let me know if this is already implemented but just isn't documented (I did a quick git grep and couldn't find anything). Otherwise I can put in some time to implement it for another project (if y'all want it implemented in a certain way, let me know).

Cheers! 🍺

JBlaschke avatar Sep 28 '20 05:09 JBlaschke

No, there is nothing of the sort in either the Python or C++ layer. I think there is a related issue #26 .

If you are already managing the multi-GPU style of code your application would require, the suggestion there might cover your case (if it were implemented): you would manage your own contexts and data, simply passing the context through. Generally speaking, multi-GPU can get a little tricky; not sure how invested you already are....

Otherwise, if you just want to naively use multiple GPUs concurrently in an uncoupled way, you can set CUDA_VISIBLE_DEVICES before launching separate Python processes. If you're just bulk processing a pile of data, you may find that simple and effective.
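To make the CUDA_VISIBLE_DEVICES suggestion concrete, here is a minimal sketch of one Python process per GPU via multiprocessing. The per-chunk work below is a placeholder (the commented cufinufft import marks where the real library would come in); only the env-var pinning is the actual technique being shown. Note the env var must be set before any CUDA library is first imported in that process.

```python
import os
import multiprocessing as mp

def worker(device_id, chunk):
    # Pin this process to a single GPU *before* any CUDA library is imported;
    # that GPU will then appear as device 0 inside this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(device_id)
    # import cufinufft  # hypothetical placement: would now see only `device_id`
    return device_id, sum(chunk)  # placeholder stand-in for the real per-chunk work

if __name__ == "__main__":
    chunks = [[1, 2], [3, 4]]  # one independent chunk of work per GPU
    with mp.Pool(processes=len(chunks)) as pool:
        results = pool.starmap(worker, enumerate(chunks))
    print(results)
```

With the default fork start method the parent must not initialize CUDA before forking, or the children inherit that context; the spawn start method avoids this.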

garrettwrong avatar Sep 28 '20 12:09 garrettwrong

Thanks @garrettwrong -- for some reason I didn't see #26 even though it's plainly there.

A little more context: most of the parallelism we're interested in is data-parallel. We're targeting machines with several GPUs per CPU (who isn't?). So while CUDA_VISIBLE_DEVICES could be shoehorned into our workflow, it's a bit of a hack: we're using mpi4py to manage load balancing, so messing about with environment variables is not fun.
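For the mpi4py-managed case, the usual alternative to environment variables is to derive a local GPU id from the MPI rank and select the device explicitly. A minimal sketch, assuming a fixed GPU count per node; the names `device_for_rank` and `NGPUS_PER_NODE` are illustrative, not part of cufinufft or mpi4py:

```python
NGPUS_PER_NODE = 4  # assumption: every node exposes this many GPUs

def device_for_rank(rank, ngpus=NGPUS_PER_NODE):
    """Round-robin assignment of MPI ranks to local GPU ids."""
    return rank % ngpus

# In the real workflow this mapping would feed a device selection call, e.g.:
#   from mpi4py import MPI
#   rank = MPI.COMM_WORLD.Get_rank()
#   cupy.cuda.Device(device_for_rank(rank)).use()  # or cudaSetDevice equivalent
print([device_for_rank(r) for r in range(6)])  # ranks 4 and 5 wrap back to GPUs 0 and 1
```

This avoids touching the environment entirely, but it only works if the library (or a wrapper around it) lets each rank set its device before creating plans, which is exactly the hook under discussion here.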

This isn't super-high priority, so I'll keep tracking this issue; let me know if you need me to implement anything, or if you'd prefer I keep my grubby hands off the code.

JBlaschke avatar Sep 28 '20 15:09 JBlaschke

Should we continue this discussion in #26? Or do you want to keep this one around?

JBlaschke avatar Sep 28 '20 15:09 JBlaschke

@JBlaschke I think it would be a useful addition if you'd like to give it a try. As @garrettwrong says, though, it might take a lot of work to get right.

janden avatar Sep 28 '20 15:09 janden

Yea @janden -- since we have a specific use case in mind (data-parallel workflows managed from Python), I'll start there and keep everything in my fork. I'll keep y'all posted on my progress. FYI: depending on how high a priority this is for my collaborators, progress may be slow or fast, so let me know if you need me to speed up 😉

JBlaschke avatar Sep 28 '20 15:09 JBlaschke

Sounds good. I don't think we're in a rush on our end, but I'm curious to see what you come up with. Let me know how it goes.

janden avatar Sep 28 '20 17:09 janden

I've put together a simple solution to multi-gpu support here: #71

Since we're controlling our workflows from Python, the PR tries to be as minimal as possible.
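One minimal shape such support can take (a hypothetical sketch, not the actual contents of #71) is a device guard that switches the current GPU around each plan operation and restores it afterwards. Here `set_device`/`get_device` stand in for cudaSetDevice/cudaGetDevice (e.g. as exposed by cupy.cuda.runtime) and are injected so the sketch runs anywhere, GPU or not:

```python
from contextlib import contextmanager

@contextmanager
def on_device(device_id, set_device, get_device):
    """Temporarily switch the current GPU, restoring the previous one on exit."""
    previous = get_device()
    set_device(device_id)
    try:
        yield
    finally:
        set_device(previous)

# Demo against a fake "runtime" dict, so this is runnable without a GPU.
_current = {"dev": 0}
set_dev = lambda d: _current.__setitem__("dev", d)
get_dev = lambda: _current["dev"]

with on_device(3, set_dev, get_dev):
    # plan creation / setpts / execute would happen here, on device 3
    print(_current["dev"])  # 3
print(_current["dev"])      # 0, restored
```

The appeal of this pattern is that the plan object never needs to know about devices globally; the caller decides which GPU each plan lives on.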

JBlaschke avatar Oct 25 '20 22:10 JBlaschke