metaseq
Repo for external large-scale work
## ❓ Questions and Help

### Before asking:
1. search the issues.
2. search the docs.

#### What is your question?
I wonder how to fit my own data to...
About three weeks ago, I submitted an application for the OPT-175B model, and I have not received any message from the community or the developer team since. Does anyone know how long...
NumPy 1.24.0 removed the long-deprecated `np.float` alias, causing metaseq to throw errors: https://numpy.org/doc/stable/release/1.24.0-notes.html#expired-deprecations Given the number of bytes in the map I'm assuming we meant float32 here, even though np.float actually used 8...
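To illustrate the size mismatch this patch description points at, here is a minimal sketch, assuming only NumPy itself. The removed `np.float` was an alias for Python's built-in `float`, which NumPy treats as a 64-bit type, so swapping it for `np.float32` changes the per-element width. The `nbytes_for` helper is hypothetical and is not metaseq's actual dtype map.

```python
import numpy as np

# The removed np.float alias pointed at Python's built-in float,
# which NumPy maps to a 64-bit (8-byte) floating-point dtype.
width_builtin = np.dtype(float).itemsize

# Explicit replacements state the width directly.
width_f32 = np.dtype(np.float32).itemsize
width_f64 = np.dtype(np.float64).itemsize


def nbytes_for(dtype, count):
    """Hypothetical helper: bytes needed to store `count` elements of `dtype`."""
    return np.dtype(dtype).itemsize * count


# Reading a buffer with the wrong width would misinterpret its length.
print(width_builtin, width_f32, width_f64)
print(nbytes_for(np.float32, 10), nbytes_for(np.float64, 10))
```

Whether `np.float32` or `np.float64` is the right replacement depends on how the existing data was written, which the truncated description above does not settle.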
**Patch Description** Since we're doing manual activation checkpointing, we need a custom backward pass for multi-head attention (MHA). This patch leverages the flash implementation in xformers. TODO: - [ ] Gate behind...
### Summary of Changes We add an option to convert weights into a new `dtype` while resharding FSDP checkpoints. This helps reduce checkpoint sizes and avoids issues under RAM constraints...
### Summary of Changes The existing script for resharding model parallel parts (i.e. `metaseq/scripts/reshard_model_parallel.py`) loads all checkpoint parts at once and might result in OOM issues under RAM constraints, especially...
**Patch Description** Due to changes in the logic around NFS handling, config.yml is no longer being saved along with training runs. This is a small hack to fix that. **Testing steps** Launched a 1...
**Patch Description** Add @sharannarang to CODEOWNERS **Testing steps** n/a