meshed-memory-transformer
Question about GPU numbers
Hi, thanks for your great work!
We modified the code to train on 4 GPUs (2080 Ti) and found that the training time did not shorten much. Have you tried training with more than one GPU, and if so, is there any advantage in training time?