metaseq
Re-release consolidated OPT / OPT-IML checkpoints
After https://github.com/facebookresearch/metaseq/pull/459 and https://github.com/facebookresearch/metaseq/pull/556, we can now also release updated checkpoints that are consolidated from FSDP shards with different model parallelism. As a start, we should update all of our checkpoints to help address the following pain points that users are facing:
- https://github.com/facebookresearch/metaseq/issues/595
- https://github.com/facebookresearch/metaseq/issues/594
- https://github.com/facebookresearch/metaseq/issues/574
- https://github.com/facebookresearch/metaseq/issues/567
- https://github.com/facebookresearch/metaseq/issues/475
- https://github.com/facebookresearch/metaseq/issues/407
- https://github.com/facebookresearch/metaseq/issues/233
- https://github.com/facebookresearch/metaseq/issues/211
and previous issues:
- https://github.com/facebookresearch/metaseq/issues/31
- https://github.com/facebookresearch/metaseq/issues/125
We have internal consolidated versions for 2.7B and 30B to check against, and will also need to confirm that generation looks roughly sane after consolidation.
Yeah, looks like it, at least tangentially. The loading logic there could probably do with simplifying. It should be possible to identify the naming convention just by reading the checkpoint directory.
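
A rough sketch of what "identify the naming convention by reading the directory" could look like. This is not the existing metaseq loading code: the function name `detect_checkpoint_layout`, the layout labels, and the filename patterns are all assumptions for illustration, and the patterns would need to be adjusted to whatever names the released checkpoints actually use.

```python
import re
from pathlib import Path

# Hypothetical filename patterns (assumptions, not a confirmed metaseq spec).
# Order matters: the most specific pattern is tried first.
PATTERNS = {
    # e.g. <prefix>-model_part-0-shard3.pt
    "fsdp_sharded": re.compile(
        r"^(?P<prefix>.+)-model_part-(?P<mp>\d+)-shard(?P<shard>\d+)\.pt$"
    ),
    # e.g. <prefix>-model_part-0.pt
    "model_parallel": re.compile(r"^(?P<prefix>.+)-model_part-(?P<mp>\d+)\.pt$"),
    # e.g. <prefix>.pt (a single consolidated file)
    "single_file": re.compile(r"^(?P<prefix>.+)\.pt$"),
}


def detect_checkpoint_layout(ckpt_dir):
    """Guess the checkpoint layout from the .pt filenames in ckpt_dir."""
    names = sorted(
        p.name for p in Path(ckpt_dir).iterdir() if p.is_file() and p.suffix == ".pt"
    )
    if not names:
        raise FileNotFoundError(f"no .pt files found in {ckpt_dir}")

    for layout, pattern in PATTERNS.items():
        matches = [m for m in map(pattern.match, names) if m]
        if not matches:
            continue
        # Count distinct model-parallel parts and FSDP shards, if the
        # pattern captures them; otherwise default to 1.
        mp_parts = {int(m["mp"]) for m in matches if "mp" in m.groupdict()}
        shards = {int(m["shard"]) for m in matches if "shard" in m.groupdict()}
        return {
            "layout": layout,
            "model_parallel": len(mp_parts) or 1,
            "fsdp_shards": len(shards) or 1,
            "files": [m.string for m in matches],
        }

    raise ValueError(f"unrecognized checkpoint naming in {ckpt_dir}")
```

The loader could then branch on the returned `layout` instead of requiring the user to pass matching flags, which is where several of the linked issues seem to go wrong.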