
Unify Efficient Fine-Tuning of 100+ LLMs

Results: 548 LLaMA-Factory issues, sorted by recently updated

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction 1. https://github.com/abacusai/smaug 2. https://arxiv.org/abs/2402.13228 ### Expected behavior _No response_ ### System Info _No response_...

pending
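
The links above are to Smaug and the paper introducing its DPO-Positive (DPOP) objective, so this reads as a feature request for that loss. Purely as one reading of the paper (not code from LLaMA-Factory), the chosen/rejected log-ratio logits of standard DPO gain a penalty term that fires whenever the policy's log-probability on the chosen answer drops below the reference model's; a minimal sketch, with the `lambda_dpop` default as a placeholder:

```python
import torch
import torch.nn.functional as F

def dpop_loss(policy_chosen_logps, policy_rejected_logps,
              ref_chosen_logps, ref_rejected_logps,
              beta=0.1, lambda_dpop=5.0):
    """DPO-Positive sketch: standard DPO logits minus a hinge penalty on the
    chosen response, so the policy is discouraged from lowering its own
    log-prob on the preferred answer relative to the reference model."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps        # log pi/ref on y_w
    rejected_ratio = policy_rejected_logps - ref_rejected_logps  # log pi/ref on y_l
    penalty = torch.clamp(ref_chosen_logps - policy_chosen_logps, min=0.0)
    logits = chosen_ratio - rejected_ratio - lambda_dpop * penalty
    return -F.logsigmoid(beta * logits).mean()
```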

I'd like to ask for advice: on the gsm8k dataset, with completely identical settings except for the precision (bf16 vs. fp16), the learning rate, the number of epochs, etc., every run converges, yet the following three kinds of results appear (observed across many experiments, so it is a very common phenomenon; training and inference use the same template): ![f47b2665288d45292595c44da663255f](https://github.com/hiyouga/LLaMA-Factory/assets/43243253/2e2418fc-c263-4e7a-aa7b-8384442fa250) ![eece5e1823159f4c27a39bb0a7a93fd8](https://github.com/hiyouga/LLaMA-Factory/assets/43243253/e193bb26-b711-46af-8cfd-451fd22dd5a0) ![e9d98c1b6275031ce73e8521a0e78145](https://github.com/hiyouga/LLaMA-Factory/assets/43243253/15fd5329-bdc3-4501-a23c-1bbfe7245b86)

pending
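
Not an answer to the report above, but some background on why bf16 and fp16 can converge to visibly different results under otherwise identical hyperparameters: the two formats spend their 16 bits differently (fp16: 5 exponent / 10 mantissa bits; bf16: 8 exponent / 7 mantissa bits), so rounding error and overflow behave differently. A quick PyTorch illustration:

```python
import torch

# fp16 keeps more mantissa bits, so it rounds less aggressively than bf16.
x = torch.tensor(1234.5678)
print(x.to(torch.float16))   # tensor(1235., dtype=torch.float16)
print(x.to(torch.bfloat16))  # tensor(1232., dtype=torch.bfloat16)

# bf16 has fp32-like dynamic range, while fp16 overflows above ~65504.
y = torch.tensor(70000.0)
print(y.to(torch.float16))   # tensor(inf, dtype=torch.float16)
print(y.to(torch.bfloat16))  # tensor(70144., dtype=torch.bfloat16)
```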

Dear Developers, I'm delighted to inform you that the documentation update for Python scripts has been successfully completed. The updated documentation provides clear explanations of function parameters, return types, and...

pending

As the title says: are there plans to integrate the trl library's KTOTrainer?

enhancement
pending
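
For context on the request, a minimal sketch of how trl's KTOTrainer is typically wired up; the model name and the two-row toy dataset are placeholders, and argument names may differ slightly between trl versions:

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import KTOConfig, KTOTrainer

model_name = "Qwen/Qwen1.5-0.5B"  # placeholder; any causal LM works
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# KTO uses unpaired feedback: each row is a prompt, a completion, and a
# boolean label marking the completion as desirable or undesirable.
train_dataset = Dataset.from_dict({
    "prompt": ["What is 2 + 2?", "What is 2 + 2?"],
    "completion": ["4", "5"],
    "label": [True, False],
})

args = KTOConfig(output_dir="kto-test", beta=0.1, per_device_train_batch_size=2)
trainer = KTOTrainer(model=model, args=args,
                     train_dataset=train_dataset, tokenizer=tokenizer)
trainer.train()
```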

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction We observe that across models and datasets using Zero3 with bf16 yields much higher...

pending

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction deepspeed --num_gpus 2 src/train_bash.py \ --deepspeed ds_config.json \ --stage sft \ --do_train \ --model_name_or_path...

pending
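
The command above points at a ds_config.json that is not shown. Purely as an illustration (not the reporter's actual file), a minimal ZeRO-3 configuration with bf16 enabled and "auto" values inherited from the HF TrainingArguments could be generated like this:

```python
import json

# Illustrative ZeRO-3 + bf16 config; "auto" lets the HF Trainer fill in the
# values from the arguments passed to train_bash.py.
ds_config = {
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "bf16": {"enabled": "auto"},
    "zero_optimization": {
        "stage": 3,
        "overlap_comm": True,
        "contiguous_gradients": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```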

https://huggingface.co/docs/trl/main/en/dpo_trainer#downsides-to-merging-qlora-before-dpo-approach-2 https://github.com/jondurbin/qlora/blob/main/qmerge.py

pending
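
For reference, the peft flow behind "merge the (Q)LoRA adapter, then run DPO" that both links discuss is roughly the following; the model id and adapter path are placeholders, and loading the base in bf16 rather than 4-bit reflects the advice in the linked docs and qmerge.py to merge into unquantized weights:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-2-7b-hf"   # placeholder base model
adapter_dir = "path/to/qlora-adapter"  # placeholder adapter checkpoint

# Load the base model in half precision (not 4-bit) and fold the adapter in.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_dir)
merged = model.merge_and_unload()

merged.save_pretrained("merged-sft-model")
AutoTokenizer.from_pretrained(base_id).save_pretrained("merged-sft-model")
```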

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction Multiturn datasets are supported in the original dpo example. Wouldn't it be appropriate for...

pending
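
To make the request concrete, here is a hypothetical multi-turn pairwise record in the alpaca-style layout LLaMA-Factory uses elsewhere (instruction/input/output/history); whether the history column is actually consumed by the DPO stage is exactly what the issue asks about:

```python
# Hypothetical record for illustration only: earlier turns live in "history",
# while "output" holds the chosen and rejected responses for the last turn.
example = {
    "instruction": "And how do I revert the last commit?",
    "input": "",
    "output": [
        "Use `git revert HEAD` to create a new commit that undoes it.",  # chosen
        "Just delete the .git folder and start over.",                   # rejected
    ],
    "history": [
        ["How do I create a new branch in git?",
         "Run `git checkout -b <branch-name>`."],
    ],
}
```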

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction Looking at the current sft_packing implementation, it simply concatenates different single-turn SFT samples together and then computes the loss on each target span separately: def preprocess_packed_supervised_dataset( examples: Dict[str, List[Any]], tokenizer: "PreTrainedTokenizer", template: "Template", data_args: "DataArguments", ) ->...

pending
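
To spell out the packing behaviour described above, a simplified, self-contained sketch of the idea (not the actual preprocess_packed_supervised_dataset code): tokenized (prompt, answer) pairs are concatenated, prompt tokens are masked out of the labels with IGNORE_INDEX, and the stream is chunked into cutoff_len-sized blocks, with no attention-mask boundary between the packed samples, which is the point the issue raises.

```python
from typing import List, Tuple

IGNORE_INDEX = -100  # label value ignored by the cross-entropy loss

def pack_examples(pairs: List[Tuple[List[int], List[int]]], cutoff_len: int):
    """Simplified packing sketch.

    Concatenates (prompt_ids, answer_ids) pairs into one token stream, gives
    prompt positions IGNORE_INDEX labels so only answer spans contribute to
    the loss, then splits the stream into cutoff_len-sized blocks (dropping
    the trailing remainder).
    """
    input_ids, labels = [], []
    for prompt_ids, answer_ids in pairs:
        input_ids += prompt_ids + answer_ids
        labels += [IGNORE_INDEX] * len(prompt_ids) + answer_ids

    total_len = (len(input_ids) // cutoff_len) * cutoff_len
    return {
        "input_ids": [input_ids[i:i + cutoff_len] for i in range(0, total_len, cutoff_len)],
        "labels": [labels[i:i + cutoff_len] for i in range(0, total_len, cutoff_len)],
    }
```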

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction I would like to know if there is an option to evaluate the model...

pending