pix2pix3D

Training stuck at the beginning

Open: mumukawayi opened this issue 1 year ago • 1 comment

Thanks for this great work! I was trying to follow the training procedure, but training seems to get stuck at the beginning. It keeps showing the following for more than ten minutes and does not proceed any further: `Training for 25000 kimg...

tick 0 kimg 0.0 time 1m 20s sec/tick 5.5 sec/kimg 1372.65 maintenance 74.8 cpumem 5.39 gpumem 21.01 reserved 22.00 augment 0.000` By the way, I was trying to resume from "afhqcats512-128.pkl". Could anybody give me some advice on how to move on?

mumukawayi, Sep 20 '23 07:09

Hi, I can't pinpoint the problem from the information provided. However, if you are training with multiple GPUs, you might want to try setting the environment variable `NCCL_P2P_DISABLE=1`.
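One way to try this suggestion is sketched below. It starts a command with `NCCL_P2P_DISABLE=1` added to the child's environment, so the variable is already set when NCCL initializes. The helper names `env_with_p2p_disabled` and `run_with_p2p_disabled` are hypothetical and not part of pix2pix3D, and the training command itself is left to you. Whether this resolves the hang depends on the machine.

```python
import os
import subprocess


def env_with_p2p_disabled(base_env=None):
    """Return a copy of the environment with NCCL_P2P_DISABLE=1 set.

    Hypothetical helper for illustration; it does not modify os.environ.
    """
    env = dict(os.environ if base_env is None else base_env)
    # NCCL reads this variable at initialization; "1" turns off
    # direct GPU-to-GPU (peer-to-peer) transfers.
    env["NCCL_P2P_DISABLE"] = "1"
    return env


def run_with_p2p_disabled(command):
    """Run `command` (a list of arguments) with NCCL P2P disabled.

    Returns the subprocess.CompletedProcess for the finished command.
    """
    return subprocess.run(command, env=env_with_p2p_disabled(), check=False)
```

To use it, call `run_with_p2p_disabled([...])` with your usual multi-GPU training command as a list of arguments. In a shell, prefixing the command with `NCCL_P2P_DISABLE=1` has the same effect.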

dunbar12138, Dec 05 '23 05:12