Oscar
Fix the deadlock when using distributed training in VQA finetuning
When using distributed training, processes with local_rank != 0 never call torch.distributed.barrier(), so the rank-0 process blocks at its barrier forever and the job deadlocks. The fix ensures every rank calls torch.distributed.barrier() the same number of times.
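A minimal sketch of the failure mode and the fix, using Python's `threading.Barrier` as a stand-in for `torch.distributed.barrier()` so it runs without a multi-process launcher (the `worker` function and `WORLD_SIZE` here are illustrative, not Oscar's actual code): a barrier only releases once every participant has reached it, so if non-zero ranks skip the call, rank 0 waits forever.

```python
import threading

# Stand-in for torch.distributed.barrier(): wait() blocks until all
# WORLD_SIZE participants have called it.
WORLD_SIZE = 2
barrier = threading.Barrier(WORLD_SIZE)

def worker(local_rank, results):
    # Buggy pattern: only rank 0 reached the barrier, so it hung forever.
    # Fixed pattern (shown here): every rank calls the barrier, matching
    # rank 0's call, so all ranks proceed together.
    barrier.wait()
    results[local_rank] = "passed barrier"

results = {}
threads = [threading.Thread(target=worker, args=(rank, results))
           for rank in range(WORLD_SIZE)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results.items()))
```

In the real PyTorch code the same rule applies: any `torch.distributed.barrier()` guarded by an `if local_rank == 0:` branch must have a matching call on the other ranks' code path, otherwise rank 0 blocks indefinitely.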