llama-recipes
Multiple GPUs one node inference
Hello,
I recently finetuned the Llama 2 model on one node with multiple GPUs using the following command, aiming to adapt it to my own dataset:
torchrun --nnodes 1 --nproc_per_node 4 examples/finetuning.py --enable_fsdp --model_name /path_to_model_folder/7B --use_peft --peft_method lora --output_dir Path/to/save/PEFT/model
This process generated two files: adapter_config.json and adapter_model.safetensors. Could you please provide guidance on how to use these files for performing inference tasks?
Additionally, I've chosen to use LlamaForSequenceClassification for a multiple-choice question dataset, diverging from the LlamaForCausalLM used in your standard codebase. Will this modification be compatible with the finetuning and inference processes?
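For context, a minimal sketch of how a multiple-choice example could be framed as single-sequence classification for LlamaForSequenceClassification (the prompt formatting and label scheme here are my own assumptions, not from the codebase):

```python
# Hypothetical formatting: concatenate the question and all choices into
# one sequence; the classification head then predicts the index of the
# correct choice (num_labels = number of choices when loading the model,
# e.g. LlamaForSequenceClassification.from_pretrained(path, num_labels=4)).

def format_mcq(question: str, choices: list[str]) -> str:
    """Build a single input sequence from a question and its choices."""
    lines = [f"Question: {question}"]
    for i, choice in enumerate(choices):
        # Label choices A, B, C, ... so the gold answer maps to an index.
        lines.append(f"{chr(ord('A') + i)}. {choice}")
    return "\n".join(lines)

example = format_mcq(
    "Which planet is closest to the sun?",
    ["Venus", "Mercury", "Earth", "Mars"],
)
print(example)
```

The gold label for this example would be index 1 ("B. Mercury"); the exact template matters less than using it consistently for finetuning and inference.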
Thank you for your assistance!
@qianjyM you should be able to use the following command to run inference on your model; further options for inference can be found here:
python examples/inference.py --model_name /path_to_model_folder/7B --peft_model output_dir
Hi! It seems that a solution has been provided for this issue and there has been no follow-up for a long time. I will close this issue for now; feel free to reopen it if you have any questions!