Set offload_to_cpu=True for sharded and local state dicts

Open eracah opened this issue 2 years ago • 0 comments
What does this PR do?

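Sets the default for sharded and local state dicts to offload_to_cpu=True. This helps avoid OOMs for large models when saving sharded checkpoints, since the state dict tensors are moved to CPU memory instead of being held on the GPU while the checkpoint is written.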
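
For reference, a minimal sketch of what this setting looks like at the PyTorch FSDP level. This is not the code changed in this PR; `build_state_dict_config` is a hypothetical helper written for illustration, and it assumes a PyTorch version that exports these config classes from `torch.distributed.fsdp`.

```python
from torch.distributed.fsdp import (
    FullStateDictConfig,
    LocalStateDictConfig,
    ShardedStateDictConfig,
    StateDictType,
)


def build_state_dict_config(kind: str):
    """Hypothetical helper: return (StateDictType, config) with CPU offload on.

    `kind` is an illustrative string key, not a Composer argument.
    """
    if kind == "sharded":
        # Each rank keeps only its own shard, offloaded to CPU memory.
        return (
            StateDictType.SHARDED_STATE_DICT,
            ShardedStateDictConfig(offload_to_cpu=True),
        )
    if kind == "local":
        return (
            StateDictType.LOCAL_STATE_DICT,
            LocalStateDictConfig(offload_to_cpu=True),
        )
    if kind == "full":
        # Full state dicts are commonly gathered on rank 0 only.
        return (
            StateDictType.FULL_STATE_DICT,
            FullStateDictConfig(offload_to_cpu=True, rank0_only=True),
        )
    raise ValueError(f"unknown state dict kind: {kind!r}")


# Usage inside an initialized distributed job, with `model` wrapped in FSDP:
#
#   from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
#   sd_type, sd_config = build_state_dict_config("sharded")
#   with FSDP.state_dict_type(model, sd_type, sd_config):
#       state_dict = model.state_dict()
```

The usage part is left as a comment because it needs an initialized process group and an FSDP-wrapped model.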

Testing

Ran a manual test of saving 30B checkpoints.

What issue(s) does this change relate to?

https://github.com/mosaicml/llm-foundry/issues/367

eracah, Jun 30 '23 04:06