
Unify Efficient Fine-Tuning of 100+ LLMs

Results: 548 LLaMA-Factory issues, sorted by recently updated

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction ```bash cd LLaMA-Factory && HF_ENDPOINT=https://hf-mirror.com accelerate launch src/train_bash.py \ --stage sft \ --do_train \...

pending
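
For orientation, a minimal single-GPU LoRA SFT launch with `src/train_bash.py` typically looks like the sketch below. All paths, the dataset name, and the hyperparameters are placeholders, not values taken from the issue above.

```bash
# Minimal LoRA SFT sketch for src/train_bash.py (placeholder paths and dataset)
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path path_to_base_model \
    --dataset alpaca_gpt4_en \
    --template default \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --output_dir path_to_sft_checkpoint \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --fp16
```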

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction Launch command: `torchrun --nproc_per_node $NPROC_PER_NODE \ --nnodes $NNODES \ --node_rank $RANK \ --master_addr $MASTER_ADDR \...

pending
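
For reference, a multi-node launch with `torchrun` follows the pattern below; the environment variables mirror those in the issue and must be set consistently on every node, while the trailing training arguments are the same as for a single-node run.

```bash
# Multi-node launch sketch; NPROC_PER_NODE, NNODES, RANK, MASTER_ADDR and
# MASTER_PORT must be exported on each node (values are site-specific).
torchrun --nproc_per_node $NPROC_PER_NODE \
    --nnodes $NNODES \
    --node_rank $RANK \
    --master_addr $MASTER_ADDR \
    --master_port $MASTER_PORT \
    src/train_bash.py \
    --stage sft \
    --do_train \
    ... # remaining training arguments as in a single-node run
```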

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction python src/export_model.py \ --model_name_or_path ../../../workspace/Llama/Qwen-14B-Chat-Int4 \ --adapter_name_or_path ../LLaMA-Factory-main-bk/path_to_sft14bint4_checkpoint/checkpoint-7000 \ --template default \ --finetuning_type lora...

pending
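
For context, merging a LoRA adapter into a base model with `src/export_model.py` generally takes the form below; the paths are placeholders, and the issue above differs in that its base model is a GPTQ Int4 checkpoint.

```bash
# LoRA merge/export sketch (placeholder model and adapter paths)
python src/export_model.py \
    --model_name_or_path path_to_base_model \
    --adapter_name_or_path path_to_lora_checkpoint \
    --template default \
    --finetuning_type lora \
    --export_dir path_to_export
```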

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction python src/cli_demo.py \ --model_name_or_path /hy-tmp/model/gemma-7b \ --template gemma \ --finetuning_type lora \ --adapter_name_or_path /home/lzl/python-workspace/llama-efficient-tuning/sft_checkpoint_gemma/checkpoint-200...

pending
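
A typical interactive chat launch with a LoRA adapter looks roughly like the sketch below; the paths are placeholders.

```bash
# Interactive CLI chat sketch with a LoRA adapter (placeholder paths)
CUDA_VISIBLE_DEVICES=0 python src/cli_demo.py \
    --model_name_or_path path_to_base_model \
    --adapter_name_or_path path_to_lora_checkpoint \
    --template gemma \
    --finetuning_type lora
```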

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction deepspeed --include="localhost:0,1,2,3,4,5,6,7" src/train_bash.py \ --stage sft \ --do_train \ --model_name_or_path gemma-7b \ --dataset XXX.json...

pending
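
For reference, a DeepSpeed launch normally pairs the launcher with a `--deepspeed` config file; the JSON file name below is a placeholder, as are the dataset and remaining arguments.

```bash
# DeepSpeed launch sketch; ds_config.json is a placeholder ZeRO config
deepspeed --include="localhost:0,1,2,3,4,5,6,7" src/train_bash.py \
    --deepspeed ds_config.json \
    --stage sft \
    --do_train \
    --model_name_or_path gemma-7b \
    --template gemma \
    --finetuning_type lora \
    ... # dataset and remaining training arguments
```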

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction ``` CUDA_VISIBLE_DEVICES=4 python src/export_model.py \ --model_name_or_path saves/export/Oral_calculation/1-grade/Qwen1.5-4B-Chat/SFT_2024-03-08 \ --template qwen \ --finetuning_type lora \...

pending

### Reminder - [x] I have read the README and searched the existing issues. ### Reproduction I fine-tuned the ChatGLM3 model with LoRA, then used the export feature in the web UI to export (i.e. merge) the model. A few questions: 1. When exporting from the web UI, which checkpoint under the given adapter path gets merged? Is it the last one? 2. If I want to merge a specific checkpoint, do I have to use the command line, i.e. pass `--adapter_name_or_path path_to_checkpoint` pointing at that checkpoint? 3. I want to continue training from a specific checkpoint, but the web UI does not seem to let me choose one. Does it resume from the last checkpoint? To do this, do I have to use the command line and point `--adapter_name_or_path` at that checkpoint? ### Expected...

pending
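
The command-line route the issue asks about would look roughly like the sketch below: pointing `--adapter_name_or_path` at a specific checkpoint directory, both when merging and when continuing training. All paths are placeholders.

```bash
# Merge a specific checkpoint instead of the latest adapter (placeholder paths)
python src/export_model.py \
    --model_name_or_path path_to_chatglm3_base \
    --adapter_name_or_path path_to_sft_output/checkpoint-1000 \
    --template chatglm3 \
    --finetuning_type lora \
    --export_dir path_to_merged_model

# Continue training by loading that checkpoint as the starting adapter
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path path_to_chatglm3_base \
    --adapter_name_or_path path_to_sft_output/checkpoint-1000 \
    --template chatglm3 \
    --finetuning_type lora \
    --output_dir path_to_new_output \
    ... # dataset and remaining training arguments
```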

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction #!/bin/bash CUDA_VISIBLE_DEVICES=0,1 python cli_demo.py \ --model_name_or_path ../model/Qwen1.5-72B-Chat-GPTQ-Int4 \ --template qwen \ --use_fast_tokenizer True \...

pending

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction When setting up DeepSpeed ZeRO-3 via accelerate config, the following error is raised; it looks like a compatibility issue with DeepSpeed. Also, is there a PP/TP scheme for multi-GPU GaLore training? Many thanks. DeepSpeed version: deepspeed 0.12.5+2ce6bf8c. llama-factory has already been updated to the latest version via git pull. The training shell script and log are shown below. ### Expected behavior ``` accelerate...

pending
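
For reference, launching through an accelerate config that enables a DeepSpeed ZeRO-3 plugin usually takes the form below; the YAML file name is a placeholder and this sketch does not address the compatibility error itself.

```bash
# accelerate launch sketch with a DeepSpeed ZeRO-3 config (placeholder file name)
accelerate launch --config_file zero3_config.yaml src/train_bash.py \
    --stage sft \
    --do_train \
    ... # model, dataset and remaining training arguments
```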

When fine-tuning Qwen1.5-14B-int8-Chat, GPU memory is not released after training finishes; train_web has to be killed to free it. Loading a model and then unloading it via the load-model feature also does not release the memory. The problem exists in LLaMA-Factory versions 0.5.0 and 0.5.3. ![image](https://github.com/hiyouga/LLaMA-Factory/assets/14145007/e61abe42-af42-41c8-8c8d-d7653a499fb9) ![image](https://github.com/hiyouga/LLaMA-Factory/assets/14145007/440446b4-99cb-4360-b65c-0ac3683fbb7f) The issue does not occur with chatglm3 or Qwen-7b.

pending