Choose CUDA device on a multi-GPU machine

Status: Open • tvogels opened this issue 10 months ago • 2 comments

Hello!

Is there an API to select which GPU to use on a multi-GPU machine? I would like to use drjit in an ML training loop with model parallelism, where a single process drives several GPUs, so I cannot easily steer it with CUDA_VISIBLE_DEVICES.

tvogels, Mar 03 '25 13:03

Hey @tvogels :)

Maybe it would just be a matter of exposing set_device() through bindings?

https://github.com/mitsuba-renderer/drjit/blob/7cdc80751572402ae3e1f060030b59fd01393ee7/include/drjit/array_router.h#L1468-L1470

I've never used it myself though, so I don't know what it entails.

merlinND, Mar 03 '25 14:03

Just saw this while coming here to post an issue; this might also be relevant: https://github.com/mitsuba-renderer/drjit-core/issues/64. My workaround was to use CUDA_VISIBLE_DEVICES to restrict Dr.Jit to the appropriate device in each process and then use DDP to parallelise across processes.

Microno95, Apr 01 '25 15:04
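
For readers who want to try the CUDA_VISIBLE_DEVICES workaround mentioned above, here is a minimal sketch of one way it could look with one process per GPU. It assumes the launcher (for example torchrun) sets a LOCAL_RANK environment variable for each worker; the helper name select_visible_device is hypothetical and is not part of Dr.Jit. The variable has to be set before any CUDA-using library initialises, so the import of drjit is only indicated in a comment. Note that this only covers the one-process-per-GPU case and does not solve the single-process model-parallel setup from the original question.

```python
import os


def select_visible_device(local_rank, env=os.environ):
    """Restrict the current process to a single GPU (hypothetical helper).

    If CUDA_VISIBLE_DEVICES is already set (e.g. "2,3"), local_rank indexes
    into that list; otherwise local_rank is used as the device index itself.
    """
    visible = env.get("CUDA_VISIBLE_DEVICES", "")
    devices = [d.strip() for d in visible.split(",") if d.strip()]
    chosen = devices[local_rank] if devices else str(local_rank)
    env["CUDA_VISIBLE_DEVICES"] = chosen
    return chosen


if __name__ == "__main__":
    # Assumption: the launcher exports LOCAL_RANK; fall back to 0 otherwise.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    select_visible_device(local_rank)

    # Only import GPU libraries after the variable is set, e.g.:
    # import drjit as dr
```

After this runs, the process sees exactly one CUDA device, which appears as device 0 to any library loaded afterwards.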