Setting `expandable_segments:True` in our recipes.
Context
What is the purpose of this PR? Is it to
- [x] add a new feature
- [ ] fix a bug
- [ ] update tests and/or documentation
- [ ] other (please add here)
#1185
Changelog
Adding `os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"` to all our recipes.
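For illustration, a minimal sketch of where such a line could sit in a recipe entry point (the exact placement inside torchtune's recipes is an assumption here). The variable has to be set before the first CUDA allocation, otherwise the caching allocator is already configured and the flag is ignored:

```python
import os

# Must run before torch allocates any CUDA memory; the allocator reads
# PYTORCH_CUDA_ALLOC_CONF once, when it is first initialized.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch  # noqa: E402  (imported after setting the env var)


def main() -> None:
    # All subsequent CUDA allocations use expandable segments,
    # which can reduce fragmentation-driven peak reserved memory.
    device = torch.device("cuda")
    model = torch.nn.Linear(1024, 1024).to(device)
    _ = model(torch.randn(8, 1024, device=device))


if __name__ == "__main__":
    main()
```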
Test plan
Run on 2080 Super 8GB.
```
(tune) salman@combuter:~/torchtune$ echo $PYTORCH_CUDA_ALLOC_CONF
(tune) salman@combuter:~/torchtune$ tune run full_finetune_single_device --config qwen2/0.5B_full_single_device log_peak_memory_stats=True metric_logger=torchtune.utils.metric_logging.WandBLogger metric_logging.project=torchtune_mem checkpointer.checkpoint_dir=/home/salman/models/Qwen2-0.5B-Instruct tokenizer.path=/home/salman/models/Qwen2-0.5B-Instruct/vocab.json tokenizer.merges_file=/home/salman/models/Qwen2-0.5B-Instruct/merges.txt max_steps_per_epoch=250
...
1|250|Loss: 1.0745091438293457: 100%|██████████| 250/250 [22:37<00:00, 5.43s/it]
```
WandB logs of the successful run show all peak memory stats <= 8GB.
See #1273 for evidence of the other small models and single-device recipes.
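As a rough way to sanity-check the peak-memory numbers outside of WandB (a sketch, not the recipe's actual logging code; torchtune's `log_peak_memory_stats` presumably derives its figures from the same allocator counters, but that is an assumption here), the high-water marks can be read directly from torch after a run:

```python
import torch

# Allocator high-water marks since the start of the process (or since the
# last call to torch.cuda.reset_peak_memory_stats()).
peak_alloc_gib = torch.cuda.max_memory_allocated() / 1024**3
peak_reserved_gib = torch.cuda.max_memory_reserved() / 1024**3
print(f"peak allocated: {peak_alloc_gib:.2f} GiB")
print(f"peak reserved:  {peak_reserved_gib:.2f} GiB")

# The 8GB bound matches the 2080 Super used in the test plan above.
assert peak_reserved_gib <= 8.0
```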
- [x] run pre-commit hooks and linters (make sure you've first installed via `pre-commit install`)
- [x] add unit tests for any new functionality
- [ ] update docstrings for any new or updated methods or classes
- [x] run unit tests via `pytest tests`
- [ ] run recipe tests via `pytest tests -m integration_test`
- [ ] manually run any new or modified recipes with sufficient proof of correctness
- [x] include relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)