metaseq
Check that model parallel ranks have consistent data
- check parallel ranks have consistent data
- remove a potential race condition when saving checkpoints that lets sequences_consumed get ahead of the number of iterations
- Add code to fix up potentially broken data loaders on a restart
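
The consistency check in the first bullet can be illustrated with a minimal sketch. This is not the code from this change: `batch_digest`, `find_inconsistent_ranks`, and `check_consistent` are hypothetical names, and the list of digests stands in for whatever a real run would gather across the model parallel group (for example with `torch.distributed.all_gather_object`, which is an assumption about how one might wire it up). The idea is only that each rank fingerprints its batch and any rank that disagrees with rank 0 is reported.

```python
import hashlib
from typing import List, Sequence


def batch_digest(token_ids: Sequence[int]) -> str:
    """Return a short, deterministic fingerprint of one rank's batch.

    Hypothetical helper: hashes the token ids in order.
    """
    h = hashlib.sha256()
    for tok in token_ids:
        h.update(int(tok).to_bytes(8, "little", signed=True))
    return h.hexdigest()[:16]


def find_inconsistent_ranks(digests: List[str]) -> List[int]:
    """Return the ranks whose digest differs from rank 0's digest.

    `digests[i]` is assumed to be the digest reported by model parallel
    rank i after a gather step (simulated here with a plain list).
    """
    return [rank for rank, d in enumerate(digests) if d != digests[0]]


def check_consistent(digests: List[str]) -> None:
    """Raise if any model parallel rank saw different data than rank 0."""
    bad = find_inconsistent_ranks(digests)
    if bad:
        raise RuntimeError(
            f"model parallel ranks {bad} have data inconsistent with rank 0"
        )


if __name__ == "__main__":
    # Simulate four ranks; rank 2 received a different batch.
    digests = [batch_digest([1, 2, 3]) for _ in range(4)]
    digests[2] = batch_digest([1, 2, 4])
    print(find_inconsistent_ranks(digests))
```

Comparing short digests rather than full batches keeps the amount of data exchanged between ranks small, at the cost of only detecting that a mismatch exists, not where in the batch it occurs.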