PiPPy
Meta-init Llama, then pipeline, then materialize
Models can be too big to fit on a single device. Therefore we need to:
- create the model's "skeleton" on meta device
- partition it so that it can fit on each device, and
- materialize each partition.
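The meta-device steps above can be sketched as follows. This is a toy stand-in, not the real script: a small nn.Sequential plays the role of the large model, and one layer plays the role of a pipeline stage.

```python
import torch
import torch.nn as nn

# Step 1: create the model "skeleton" on the meta device.
# Inside this context, parameters carry only shape/dtype metadata;
# no real memory is allocated.
with torch.device("meta"):
    model = nn.Sequential(
        nn.Linear(1024, 1024),
        nn.ReLU(),
        nn.Linear(1024, 1024),
    )
assert model[0].weight.is_meta  # no storage behind this tensor yet

# Step 2 (partitioning) is done by PiPPy in the real script; here we just
# pretend model[0] is the partition assigned to this device.

# Step 3: materialize the partition on a real device. to_empty() allocates
# uninitialized storage; the values must then be filled in, e.g. from a
# checkpoint shard (as in the real script) or from an initializer (here).
stage = model[0].to_empty(device="cpu")
with torch.no_grad():
    nn.init.kaiming_uniform_(stage.weight)
    stage.bias.zero_()
assert not stage.weight.is_meta
```

The point of this pattern is that only each device's own partition ever occupies real memory; the full model exists only as shapes on the meta device.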
This is a demo based on the Llama-2-7b-chat-hf model
and its checkpoint on the Hugging Face Model Hub.
Before running the script, please download the following files into the same directory as this script:
- pytorch_model.bin.index.json
- pytorch_model-00001-of-00002.bin
- pytorch_model-00002-of-00002.bin
Download link: https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/tree/main
Your directory should look like this:
- meta_init.py (this script)
- pytorch_model.bin.index.json
- pytorch_model-00001-of-00002.bin
- pytorch_model-00002-of-00002.bin
How to run this script:
$ python meta_init.py
I haven't used a distributed runtime because I only have a MacBook at hand, but I show how to load each stage module from the HF checkpoints. Feel free to modify the script to run in a distributed way by distributing the for loop at [Note 3].
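The per-stage loading relies on the sharded-checkpoint index file: pytorch_model.bin.index.json maps each parameter name to the shard file that contains it, so a stage only needs to open the shards that hold its own parameters. Here is a minimal sketch of that lookup; the inline index dict and the helper `shards_for` are illustrative stand-ins, not the real file or the script's actual code.

```python
# A tiny inline stand-in for the "weight_map" section of
# pytorch_model.bin.index.json (the real file covers every parameter).
weight_map = {
    "model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
    "model.layers.31.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
}

def shards_for(param_names, weight_map):
    """Return the set of shard files a pipeline stage needs for its parameters."""
    return {weight_map[name] for name in param_names if name in weight_map}

# A stage holding only layer 0 touches only the first shard:
needed = shards_for(["model.layers.0.self_attn.q_proj.weight"], weight_map)
# → {"pytorch_model-00001-of-00002.bin"}
```

In the real script, each stage would then torch.load() only the shards in its `needed` set and copy the matching tensors into its materialized parameters.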
My torch version:
torch 2.5.0.dev20240722
I installed it with:
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu
Cc: @lessw2020 @muellerzr @SunMarc @H-Huang @wconstab @LucasLLC