
multi-gpu support

Open denizyuret opened this issue 8 years ago • 3 comments

This is a general issue to develop support for various multi-gpu techniques in Knet.

denizyuret avatar Feb 15 '17 14:02 denizyuret

Are there plans to implement multi-GPU support? I'm considering re-implementing my PyTorch models using Knet, but multi-GPU support is an important factor since I'm working on a cluster.

Thanks!

dfenn avatar Jan 07 '20 19:01 dfenn

The last person to work on this was @cangumeli, who wrote Knet/src/uva.jl. Below is what he says about the latest state of affairs (note that this predates Julia's new multithreading interface). I'd love to work on this with someone who knows the requirements better than me.

We currently have GPU-direct support in Knet. You can access a different GPU's memory without copying, and you can copy from one GPU to another directly. All you need to do is call Knet.enableP2P() to use these features. See https://github.com/denizyuret/Knet.jl/blob/master/src/uva.jl for the implementation.
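For reference, the intended usage is roughly as follows. This is a sketch, not a tested recipe: `Knet.enableP2P()` is from src/uva.jl as linked above, and I'm assuming `Knet.gpu(i)` for device selection and `copyto!` as the direct GPU-to-GPU copy path.

```julia
using Knet

Knet.enableP2P()        # enable peer-to-peer access between devices (src/uva.jl)

Knet.gpu(0)             # make GPU 0 the active device
a = KnetArray(rand(Float32, 1024))

Knet.gpu(1)             # switch to GPU 1
b = KnetArray(zeros(Float32, 1024))

copyto!(b, a)           # with P2P enabled, this should copy GPU-to-GPU directly,
                        # without staging through host memory
```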

On the other hand, we don't know how to parallelize model execution properly. There are the following options to try:

  1. Julia has multi-process parallelism features. This will require synchronization on the CPU, as GPU-direct doesn't play well with multi-processing.
  1. You can use CUDA streams to execute multiple CUDA kernels in parallel, on one or more GPUs. With this, you can utilize GPU-direct. However, support for CUDA streams must be added to Knet kernels for that, and cudaMalloc and cudaFree cannot be parallelized with CUDA streams. See https://www.juliabloggers.com/multiple-gpu-parallelism-on-the-hpc-with-julia/ for an old example by the CUDArt people.
  1. You can try Julia threads. With threads, you can parallelize everything, but Knet and many of Julia's IO features are not thread-safe. A proper lock that lets GPU kernels run in parallel while serializing Julia execution may help; as far as I know, the PyTorch people do this with the Python GIL.
  1. Finally, there are CUDA-aware MPI libraries like MVAPICH that you may try to use with MPI.jl.
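To make option 1 concrete, here is a minimal data-parallel sketch using Julia's Distributed standard library, with one worker process per GPU. `grad_on_shard` is a hypothetical per-worker training step (not a Knet API), and it assumes `Knet.gpu(i)` binds the calling process to device `i`; gradient averaging would happen on the master, which is the CPU-side synchronization mentioned above.

```julia
using Distributed
addprocs(2)                              # launch one worker process per GPU
@everywhere using Knet

# Hypothetical per-worker step: bind to a device, run the forward/backward
# pass on one shard of the mini-batch, and return the gradients.
@everywhere function grad_on_shard(dev, x, y)
    Knet.gpu(dev)                        # select GPU `dev` in this process
    # ... compute and return gradients for (x, y) here ...
end

# Split each mini-batch into per-GPU shards, compute gradients in parallel
# on the workers, then average them on the master process.
shards = [(x1, y1), (x2, y2)]            # placeholder shard data
grads  = pmap(((i, s),) -> grad_on_shard(i - 1, s...), enumerate(shards))
```

The cost of this approach is that parameters and gradients cross process boundaries through serialization on the CPU, which is exactly why the GPU-direct path doesn't help here.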

denizyuret avatar Jan 08 '20 05:01 denizyuret

Thank you for the update. It sounds like it wouldn't be easily implementable at the moment (especially for a Julia beginner like myself), but could be something that's supported at some point.

I'll keep an eye on this from time to time. I'm interested enough in Julia that I think I'll implement my models anyway, and if at some point multi-GPU becomes easier to incorporate, I'll add it to them.

Thanks for your prompt reply and for all your work on Knet.

dfenn avatar Jan 11 '20 00:01 dfenn