Can't load Llama 3 8B into memory
ValueError: Unable to load checkpoint from Meta-Llama-3-8B/original/consolidated.00.pth. while running --config llama3/8B_qlora_single_device
Mind sharing more information on the steps you took to hit this error? For reference, I tried a fresh install and fresh download and unfortunately am unable to repro this.
@kartikayk Are you unable to clone the repo, or unable to download the model weights? For downloading weights, you can use a command like: wget --header="Authorization: Bearer HF_TOKEN" https://huggingface.co/datasets/GeneralAwareness/Various/resolve/main/file.zip
I am able to clone the repo; it's the model weights that are the issue. To download them I use: tune download meta-llama/Meta-Llama-3-8B --output-dir Meta-Llama-3-8B --hf-token <HF_TOKEN>
Then I just ran the QLoRA single-device YAML config, and it gives me this error.
Okay, never mind. I made it work after checking the source code. It did not work with the consolidated .pth file, but it does work with the HF checkpoint files (.safetensors).
Fine-tuning is working now, but only when using the safetensors files. 'peak_memory_reserved': 12.637437952
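In case it helps anyone hitting the same error: switching from the Meta consolidated checkpoint to the HF safetensors shards means pointing the recipe's checkpointer section at those files. A rough sketch of what that section of the YAML might look like (the shard filenames and the exact `_component_` path are assumptions here; check them against your actual download and your installed torchtune version, where the checkpointer class may live under `torchtune.utils` instead of `torchtune.training`):

```yaml
checkpointer:
  # HF-format checkpointer instead of the Meta one that reads consolidated.00.pth
  _component_: torchtune.training.FullModelHFCheckpointer
  # Directory the weights were downloaded into (top level, not original/)
  checkpoint_dir: Meta-Llama-3-8B
  # Safetensors shard names as they typically appear in the HF repo -- verify locally
  checkpoint_files: [
    model-00001-of-00004.safetensors,
    model-00002-of-00004.safetensors,
    model-00003-of-00004.safetensors,
    model-00004-of-00004.safetensors
  ]
  model_type: LLAMA3
  output_dir: Meta-Llama-3-8B
```

With that in place, rerunning the same `--config llama3/8B_qlora_single_device` recipe should load from the safetensors files rather than trying the consolidated .pth.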