meatybobby

Results 4 comments of meatybobby

The step on the first batch is run for Horovod's sake. According to [Horovod's example](https://horovod.readthedocs.io/en/stable/tensorflow.html), we need to broadcast after the first batch to initialize variables, and the broadcast need...
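As a rough illustration of why the broadcast follows the first batch, here is a toy, pure-Python simulation. It does not use Horovod or TensorFlow; `ToyWorker`, `broadcast_from_root`, and `run` are hypothetical names invented for this sketch. It assumes optimizer state is created lazily on the first step, so a broadcast issued before that step would have no optimizer state to copy.

```python
import random


class ToyWorker:
    """Hypothetical stand-in for one training process (not a Horovod API)."""

    def __init__(self, seed):
        rng = random.Random(seed)
        # Model variable exists up front, but differs per worker.
        self.weights = {"w": rng.random()}
        # Optimizer slots are created lazily on the first training step.
        self.opt_state = {}

    def train_step(self, grad, lr=0.1, momentum=0.9):
        # The momentum slot is created here on the first call.
        velocity = self.opt_state.setdefault("w/momentum", 0.0)
        velocity = momentum * velocity + grad
        self.opt_state["w/momentum"] = velocity
        self.weights["w"] -= lr * velocity


def broadcast_from_root(workers, root_rank=0):
    """Copy the root worker's variables and optimizer state to every worker."""
    root = workers[root_rank]
    for worker in workers:
        worker.weights = dict(root.weights)
        worker.opt_state = dict(root.opt_state)


def run(num_workers=3, num_batches=4):
    workers = [ToyWorker(seed=rank) for rank in range(num_workers)]
    for batch in range(num_batches):
        grad = 0.5  # pretend the gradient has already been averaged
        for worker in workers:
            worker.train_step(grad)
        if batch == 0:
            # Broadcast once, after the first step, when all state exists.
            broadcast_from_root(workers)
    return workers


if __name__ == "__main__":
    for rank, worker in enumerate(run()):
        print(rank, worker.weights, worker.opt_state)
```

In this sketch every worker ends with the same weights and momentum value as rank 0, even though each started from a different random initialization.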

Yes, that is a typo in the document. Thanks for the correction. Pretraining for BART is still a work in progress. We will add the pretraining feature in the future.

Yes, we will update the README later. The A100 config should be 80GB; we will update this as well. Thank you for the correction.