iidsample
Hi @shijieliu, thanks for your reply. Is there some way to launch without Slurm, for example directly on a bunch of nodes? It would be a great help if you could provide...
Hi @shijieliu, thanks for your quick reply. Unfortunately, I have been having a lot of trouble setting up MPI in the container to launch training. Essentially running mpirun from within the...
Hi, I have been trying to run HugeCTR in distributed mode. When I try to run mpirun with dcn_2node_8gpu.py, I get the following error: Runtime error: Error: the MPI...