vector-quantize-pytorch
How to gather cluster_size and embed_sum when using DataParallel (DP)
I noticed that the code can gather the data needed for the EMA update under DDP, but it fails when I use DP, because distributed.all_reduce requires distributed.init_process_group to be called first. How can I gather this data under DP?
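Under DP this data cannot be gathered: DP runs its replicas as threads inside a single process, so there is no process group for distributed.all_reduce to use. Please run the same job with DDP instead of DP. I had the same issue in my own work and found no workaround.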
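For reference, here is a minimal sketch of what the gathering step looks like once the job runs under DDP. It is not the library's actual code: `maybe_all_reduce` and `ema_statistics` are hypothetical names made up for illustration, and the nearest-code assignment is a plain squared-distance argmin. It assumes the job is launched with something like torchrun and that `torch.distributed.init_process_group` has already been called in each process; without a process group the helper is simply a no-op.

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F


def maybe_all_reduce(tensor: torch.Tensor) -> torch.Tensor:
    """Sum `tensor` across all ranks in place, but only if a process group exists.

    Hypothetical helper: outside DDP (for example under DP or on a single GPU)
    no process group is initialized, so the tensor is returned unchanged.
    """
    if dist.is_available() and dist.is_initialized():
        dist.all_reduce(tensor, op=dist.ReduceOp.SUM)
    return tensor


def ema_statistics(x: torch.Tensor, embed: torch.Tensor):
    """Compute per-code counts and per-code sums of assigned vectors.

    x:     (n, d) batch of input vectors on this rank
    embed: (k, d) codebook
    Returns (cluster_size, embed_sum) with shapes (k,) and (k, d).
    """
    # Squared Euclidean distance from every input to every code: (n, k)
    dists = (x[:, None, :] - embed[None, :, :]).pow(2).sum(dim=-1)
    idx = dists.argmin(dim=1)

    # One-hot assignment matrix: (n, k)
    onehot = F.one_hot(idx, num_classes=embed.shape[0]).to(x.dtype)

    cluster_size = onehot.sum(dim=0)   # how many inputs chose each code
    embed_sum = onehot.t() @ x         # sum of the inputs assigned to each code

    # Under DDP these become global statistics; otherwise they stay local.
    return maybe_all_reduce(cluster_size), maybe_all_reduce(embed_sum)
```

With a guard like this, the same module still runs on a single device, but under DP each replica would only see the statistics of its own slice of the batch, which is the limitation described above.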