llama-recipes
Multiple GPUs one node inference
Hello,
I recently finetuned the Llama 2 model on one node with multiple GPUs using the following command, aiming to adapt it to my own dataset:
torchrun --nnodes 1 --nproc_per_node 4 examples/finetuning.py --enable_fsdp --model_name /path_to_model_folder/7B --use_peft --peft_method lora --output_dir Path/to/save/PEFT/model
This process generated two files: adapter_config.json and adapter_model.safetensors. Could you please provide guidance on how to use these files for performing inference tasks?
Additionally, I've chosen to use LlamaForSequenceClassification for a multiple-choice question dataset, diverging from the LlamaForCausalLM used in your standard codebase. Will this modification be compatible with the finetuning and inference processes?
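For context, a minimal sketch of how a multiple-choice example could be framed as single-sequence classification for LlamaForSequenceClassification (the prompt formatting and label scheme here are my own assumptions, not from the codebase):

```python
# Hypothetical formatting: concatenate the question and all choices into
# one sequence; the classification head then predicts the index of the
# correct choice (num_labels = number of choices when loading the model,
# e.g. LlamaForSequenceClassification.from_pretrained(path, num_labels=4)).

def format_mcq(question: str, choices: list[str]) -> str:
    """Build a single input sequence from a question and its choices."""
    lines = [f"Question: {question}"]
    for i, choice in enumerate(choices):
        # Label choices A, B, C, ... so the gold answer maps to an index.
        lines.append(f"{chr(ord('A') + i)}. {choice}")
    return "\n".join(lines)

example = format_mcq(
    "Which planet is closest to the sun?",
    ["Venus", "Mercury", "Earth", "Mars"],
)
print(example)
```

The gold label for this example would be index 1 ("B. Mercury"); the exact template matters less than using it consistently for finetuning and inference.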
Thank you for your assistance!
@qianjyM you should be able to use the following command to run inference on your model; further options for inference can be found here:
python examples/inference.py --model_name /path_to_model_folder/7B --peft_model output_dir
Hi! It seems that a solution has been provided for this issue and there has been no follow-up for a long time. I will close this issue for now; feel free to reopen it if you have any questions!