
Adding in FSDP Docs

Open bcui19 opened this pull request 3 years ago • 1 comment

What does this PR do?

Adding FSDP docs to Composer

Before submitting

  • [ ] Have you read the contributor guidelines?
  • [ ] Is this change a documentation change or typo fix? If so, skip the rest of this checklist.
  • [ ] Was this change discussed/approved in a GitHub issue first? It is much more likely to be merged if so.
  • [ ] Did you update any related docs and document your change?
  • [ ] Did you update any related tests and add any new tests related to your change? (see testing)
  • [ ] Did you run the tests locally to make sure they pass?
  • [ ] Did you run pre-commit on your change? (see the pre-commit section of prerequisites)

bcui19 · Oct 13 '22 22:10

@bcui19 here's a pseudocode snippet that can be used as an example for FSDP:

import torch.nn as nn
from composer import Trainer
from composer.models import ComposerModel

class Block(nn.Module):
    ...

class Model(nn.Module):
    def __init__(self, n_layers):
        super().__init__()
        self.blocks = nn.ModuleList([
            Block(...) for _ in range(n_layers)
        ])
        self.head = nn.Linear(...)

    def forward(self, inputs):
        ...

    # FSDP wrap function: Composer calls this on each submodule, and any
    # module for which it returns True is wrapped in its own FSDP instance.
    def fsdp_wrap_fn(self, module):
        return isinstance(module, Block)

    # Activation checkpointing function: returns True for submodules whose
    # activations should be recomputed during the backward pass.
    def activation_checkpointing_fn(self, module):
        return isinstance(module, Block)


class MyComposerModel(ComposerModel):

    def __init__(self, n_layers):
        super().__init__()
        self.model = Model(n_layers)
        ...

    def forward(self, batch):
        ...

    def eval_forward(self, batch, outputs=None):
        ...

    def loss(self, outputs, batch):
        ...

    ...


composer_model = MyComposerModel(n_layers=3)

fsdp_config = {
    'sharding_strategy': 'FULL_SHARD',  # shard params, grads, and optimizer state (ZeRO-3-style)
    'min_params': 1e8,  # auto-wrap any module with at least 1e8 params
    'cpu_offload': False,  # Not supported yet
    'mixed_precision': 'DEFAULT',  # precision preset for params, gradients, and buffers
    'backward_prefetch': 'BACKWARD_POST',  # when to prefetch the next parameter shard during backward
    'activation_checkpointing': False,  # if True, applies activation_checkpointing_fn above
    'activation_cpu_offload': False,  # offload checkpointed activations to CPU
    'verbose': True  # log how the model was wrapped
}


trainer = Trainer(
    model=composer_model,
    fsdp_config=fsdp_config,
    ...
)

trainer.fit()
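
As a quick sanity check (a sketch, not an official Composer recipe), the wrapped model can be printed once the Trainer is constructed to confirm which submodules were sharded; trainer.state.model is the Trainer's handle on the now-wrapped model:

# Hypothetical sanity check: each Block should now appear inside its own
# FSDP wrapper, since fsdp_wrap_fn returned True for it.
print(trainer.state.model)

Note that FSDP only kicks in on a distributed run, so the script would typically be launched across multiple GPUs with Composer's launcher, e.g. composer -n 8 train.py (assuming an 8-GPU node and that the snippet is saved as train.py).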

abhi-mosaic avatar Oct 14 '22 21:10 abhi-mosaic