composer
Supercharge Your Model Training
## 🚀 Feature Request The documentation here states that stage-3 is not yet supported. https://docs.mosaicml.com/en/v0.10.0/notes/distributed_training.html#deepspeed I tried passing this config to the trainer and it seems to work: ``` deepspeed_config...
## 🚀 Feature Request A general parametrizable and hookable self-attention function that ALiBi or other future algorithm implementations can call out to, in order to change how attention is done....
GLU applies surgery to any model that has architecture modules `BertIntermediate` and `BertOutput`. Not all models have these modules; for example, DistilBert from HuggingFace has different names for the modules....
Checking a docker change
**Environment** - 8 A100s, although this also happens on other GPU types ``` No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.4 LTS Release: 20.04 Codename: focal...
## 🚀 Feature Request Currently CheckpointSaver only takes strings for folder and file names, but it would be nice for it to also take Path objects. ## Motivation Those attributes...
With indexes starting to include other projects, like streaming, we need a way to filter out search results for those projects.
Adds support for multi-device training using torch_xla. The resnet9 on cifar10 trains fine on 2 GPUs using torch_xla. Some of the issues I ran into while adding this support: -...
This PR removes YAHP from our codebase. Making this PR early to suss out any issues, and form a base for testing. Do not merge this PR until after the...
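For the CheckpointSaver request above (accepting Path objects as well as strings for folder and file names), one possible approach is to normalize the arguments at construction time. The sketch below is only an illustration: `SketchCheckpointSaver`, `normalize_path_arg`, and the default values are hypothetical names invented here, not Composer's actual class or signature. It relies only on the standard library's `os.fspath`.

```python
import os
from pathlib import Path
from typing import Union

# Either a plain string or any object implementing __fspath__ (e.g. pathlib.Path).
PathLike = Union[str, os.PathLike]


def normalize_path_arg(value: PathLike) -> str:
    """Return a plain string for a str or path-like argument (hypothetical helper)."""
    # os.fspath returns a str unchanged and calls __fspath__ on path-like objects.
    return os.fspath(value)


class SketchCheckpointSaver:
    """Hypothetical stand-in for illustration; NOT Composer's CheckpointSaver."""

    def __init__(self, folder: PathLike = "checkpoints", filename: PathLike = "ep{epoch}.pt"):
        # Store strings internally so existing string formatting keeps working.
        self.folder = normalize_path_arg(folder)
        self.filename = normalize_path_arg(filename)


# Both a Path and a str are accepted by the sketch.
saver = SketchCheckpointSaver(folder=Path("runs") / "exp1", filename="ep{epoch}.pt")
```

Converting once at the boundary keeps the rest of such a class unchanged, since every later use still sees a plain string.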