CrossPoint-DDP
RuntimeError: Timed out initializing process group in store based barrier on rank: 1, for key: store_based_barrier_key:1 (world_size=2, worker_count=8, timeout=0:30:00)
Excuse me. During distributed training, the log kept printing "DEBUG SenderThread: 1236909 [sender.py:send():182] send: stats", and the run finally failed with a RuntimeError: Timed out initializing process group in store based barrier on rank: 1, for key: store_based_barrier_key:1 (world_size=2, worker_count=8, timeout=0:30:00).

The parser settings are as follows:

    parser.add_argument('--master_addr', type=str, default='localhost', help='ip of master node')
    parser.add_argument('--master_port', type=str, default='12355', help='port of master node')

Do I need to change these parameters?
Thanks for your reply.
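
One possible reading of the error message: worker_count=8 is larger than world_size=2, which suggests that more processes registered at the same address and port than the process group expects. That could come from leftover processes of an earlier run still using port 12355, or from a world_size that does not match the number of processes actually launched. The sketch below is only an illustration under those assumptions, not the repository's own code. The helpers find_free_port, check_world_size and build_parser are hypothetical names, and only the two parser arguments are taken from the post above.

```python
import argparse
import socket


def find_free_port(host: str = "localhost") -> int:
    """Ask the OS for a TCP port on `host` that is unused right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))  # port 0 lets the OS pick a free port
        return s.getsockname()[1]


def check_world_size(world_size: int, nodes: int, procs_per_node: int) -> None:
    """Raise if world_size differs from the total number of launched processes."""
    expected = nodes * procs_per_node
    if world_size != expected:
        raise ValueError(
            f"world_size={world_size}, but {nodes} node(s) x "
            f"{procs_per_node} process(es) = {expected} workers would join"
        )


def build_parser() -> argparse.ArgumentParser:
    """Parser with the two arguments quoted in the post (hypothetical wrapper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--master_addr', type=str, default='localhost',
                        help='ip of master node')
    parser.add_argument('--master_port', type=str, default='12355',
                        help='port of master node')
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args([])
    # Assumption: a single node; pick the port once in the launcher and pass
    # the same value to every rank, otherwise the ranks cannot find each other.
    args.master_port = str(find_free_port(args.master_addr))
    check_world_size(world_size=2, nodes=1, procs_per_node=2)
    print(args.master_addr, args.master_port)
```

For single-node training, 'localhost' as master_addr is usually fine. Changing master_port only helps if something else is already bound to 12355, so it is also worth checking for stale Python processes from earlier runs before relaunching.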