DialoGPT
Training speed is not as stated in the README
Hi! I ran the training script on 130 million training instances and got the following training speeds:
- 1 V100 GPU, FP16 O2: ~14k tokens/sec, ~100 hours
- 8 V100 GPUs, FP16 O2: ~70k tokens/sec, ~20 hours
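As a sanity check, the two measurements are internally consistent with each other. A minimal sketch of the arithmetic, assuming roughly 39 tokens per training instance (back-solved from the numbers above, not a figure from the repo):

```python
def estimated_hours(num_instances, tokens_per_instance, tokens_per_sec):
    """Estimate wall-clock hours for one pass over the data
    given a measured token throughput."""
    total_tokens = num_instances * tokens_per_instance
    return total_tokens / tokens_per_sec / 3600

# ~39 tokens/instance is an assumption used only for this check.
h_1gpu = estimated_hours(130e6, 39, 14e3)   # ~100 hours on 1 V100
h_8gpu = estimated_hours(130e6, 39, 70e3)   # ~20 hours on 8 V100s
print(f"1 GPU: {h_1gpu:.0f} h, 8 GPUs: {h_8gpu:.0f} h")
```

So the 1-GPU and 8-GPU runs agree with each other; the discrepancy is only against the README's table.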
However, the README reports much faster training speeds:
![image](https://user-images.githubusercontent.com/57769557/70842701-bcc08b80-1ddb-11ea-9519-d8d230cd3b30.png)
What am I missing? Please help!
Thanks for the feedback! We need to double-check the epoch time and will get back to you on this.