Having issues on Running Llama-3-70B

Open BedirT opened this issue 1 year ago • 1 comments

Hi,

I am trying to fine-tune Llama-3-70B-Instruct in a SageMaker Notebook on a p4de.24xlarge instance. I am not sure what's wrong, but it seems the package cannot find torchtune.models.llama3.lora_llama3_70b.

The command I am running is:

tune run --nproc_per_node 8 lora_finetune_distributed --config Llama-3-70B-lora.yaml

I installed torchtune using pip, and here are my torch-related package versions:

torch==2.2.2
torchao==0.1
torchaudio==2.2.2
torchtune==0.1.1
torchvision==0.17.2

Since SageMaker environments are managed by Conda, I am also using a Conda environment in my setup. The error I get is:

AttributeError: module 'torchtune.models.llama3' has no attribute 'lora_llama3_70b'. Did you mean: 'lora_llama3_8b'?

BedirT, May 10 '24 22:05

Unfortunately, this feature is not in our stable package release yet. Can you install the nightly build and see if that fixes the issue?
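
To check whether a given install actually exposes the builder named in the error, something like the following sketch may help. It only uses the standard library; `has_builder` and `installed_version` are hypothetical helper names made up for this example, not part of torchtune, and the module and attribute names are taken from the error message above.

```python
import importlib
from importlib import metadata
from typing import Optional


def has_builder(module_name: str, attr: str) -> bool:
    """Return True if `module_name` can be imported and exposes `attr`.

    Hypothetical helper for illustration; not a torchtune API.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package (or one of its dependencies) is not importable.
        return False
    return hasattr(module, attr)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Names below come from the error message in this issue.
    print("torchtune version:", installed_version("torchtune"))
    for name in ("lora_llama3_8b", "lora_llama3_70b"):
        found = has_builder("torchtune.models.llama3", name)
        print(f"{name}: {'available' if found else 'missing'}")
```

If `lora_llama3_70b` is reported as missing, the installed release most likely predates the 70B builder, which would be consistent with the AttributeError above.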

kartikayk, May 10 '24 22:05

It works for me with the following package versions:

torch                    2.3.1+cu121
torchao                  0.1
torchaudio               2.3.1+cu121
torchtune                0.2.0.dev20240623+cu121
torchvision              0.18.1+cu121

JadarTheObscurity, Jun 23 '24 15:06

@BedirT were you able to figure this out? I am gonna close this issue as I think @kartikayk's suggestion should resolve it for you. If not, please feel free to reopen.

ebsmothers, Jun 24 '24 13:06