
Unify Efficient Fine-Tuning of 100+ LLMs

Results: 548 LLaMA-Factory issues, sorted by recently updated

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction 1. https://github.com/abacusai/smaug 2. https://arxiv.org/abs/2402.13228 ### Expected behavior _No response_ ### System Info _No response_...

pending
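
The links above are to Smaug and the paper introducing its DPO-Positive (DPOP) objective, so this reads as a feature request for that loss. Purely as one reading of the paper (not code from LLaMA-Factory), the chosen/rejected log-ratio logits of standard DPO gain a penalty term that fires whenever the policy's log-probability on the chosen answer drops below the reference model's; a minimal sketch, with the `lambda_dpop` default as a placeholder:

```python
import torch
import torch.nn.functional as F

def dpop_loss(policy_chosen_logps, policy_rejected_logps,
              ref_chosen_logps, ref_rejected_logps,
              beta=0.1, lambda_dpop=5.0):
    """DPO-Positive sketch: standard DPO logits minus a hinge penalty on the
    chosen response, so the policy is discouraged from lowering its own
    log-prob on the preferred answer relative to the reference model."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps        # log pi/ref on y_w
    rejected_ratio = policy_rejected_logps - ref_rejected_logps  # log pi/ref on y_l
    penalty = torch.clamp(ref_chosen_logps - policy_chosen_logps, min=0.0)
    logits = chosen_ratio - rejected_ratio - lambda_dpop * penalty
    return -F.logsigmoid(beta * logits).mean()
```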

I'd like to ask for advice: on the gsm8k dataset, with completely identical settings except for the precision (bf16 vs. fp16), the learning rate, the number of epochs, etc., every run converges, yet the following three kinds of results appear (observed across many experiments, so it is a very common phenomenon; training and inference use the same template): ![f47b2665288d45292595c44da663255f](https://github.com/hiyouga/LLaMA-Factory/assets/43243253/2e2418fc-c263-4e7a-aa7b-8384442fa250) ![eece5e1823159f4c27a39bb0a7a93fd8](https://github.com/hiyouga/LLaMA-Factory/assets/43243253/e193bb26-b711-46af-8cfd-451fd22dd5a0) ![e9d98c1b6275031ce73e8521a0e78145](https://github.com/hiyouga/LLaMA-Factory/assets/43243253/15fd5329-bdc3-4501-a23c-1bbfe7245b86)

pending
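
Not an answer to the report above, but some background on why bf16 and fp16 can converge to visibly different results under otherwise identical hyperparameters: the two formats spend their 16 bits differently (fp16: 5 exponent / 10 mantissa bits; bf16: 8 exponent / 7 mantissa bits), so rounding error and overflow behave differently. A quick PyTorch illustration:

```python
import torch

# fp16 keeps more mantissa bits, so it rounds less aggressively than bf16.
x = torch.tensor(1234.5678)
print(x.to(torch.float16))   # tensor(1235., dtype=torch.float16)
print(x.to(torch.bfloat16))  # tensor(1232., dtype=torch.bfloat16)

# bf16 has fp32-like dynamic range, while fp16 overflows above ~65504.
y = torch.tensor(70000.0)
print(y.to(torch.float16))   # tensor(inf, dtype=torch.float16)
print(y.to(torch.bfloat16))  # tensor(70144., dtype=torch.bfloat16)
```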

Dear Developers, I'm delighted to inform you that the documentation update for Python scripts has been successfully completed. The updated documentation provides clear explanations of function parameters, return types, and...

pending

As the title says: are there plans to integrate the trl library's KTOTrainer?

enhancement
pending
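
For context on the request, a minimal sketch of how trl's KTOTrainer is typically wired up; the model name and the two-row toy dataset are placeholders, and argument names may differ slightly between trl versions:

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import KTOConfig, KTOTrainer

model_name = "Qwen/Qwen1.5-0.5B"  # placeholder; any causal LM works
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# KTO uses unpaired feedback: each row is a prompt, a completion, and a
# boolean label marking the completion as desirable or undesirable.
train_dataset = Dataset.from_dict({
    "prompt": ["What is 2 + 2?", "What is 2 + 2?"],
    "completion": ["4", "5"],
    "label": [True, False],
})

args = KTOConfig(output_dir="kto-test", beta=0.1, per_device_train_batch_size=2)
trainer = KTOTrainer(model=model, args=args,
                     train_dataset=train_dataset, tokenizer=tokenizer)
trainer.train()
```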

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction We observe that across models and datasets using Zero3 with bf16 yields much higher...

pending

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction deepspeed --num_gpus 2 src/train_bash.py \ --deepspeed ds_config.json \ --stage sft \ --do_train \ --model_name_or_path...

pending
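
The command above points at a ds_config.json that is not shown. Purely as an illustration (not the reporter's actual file), a minimal ZeRO-3 configuration with bf16 enabled and "auto" values inherited from the HF TrainingArguments could be generated like this:

```python
import json

# Illustrative ZeRO-3 + bf16 config; "auto" lets the HF Trainer fill in the
# values from the arguments passed to train_bash.py.
ds_config = {
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "bf16": {"enabled": "auto"},
    "zero_optimization": {
        "stage": 3,
        "overlap_comm": True,
        "contiguous_gradients": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```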

https://huggingface.co/docs/trl/main/en/dpo_trainer#downsides-to-merging-qlora-before-dpo-approach-2 https://github.com/jondurbin/qlora/blob/main/qmerge.py

pending
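
For reference, the peft flow behind "merge the (Q)LoRA adapter, then run DPO" that both links discuss is roughly the following; the model id and adapter path are placeholders, and loading the base in bf16 rather than 4-bit reflects the advice in the linked docs and qmerge.py to merge into unquantized weights:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-2-7b-hf"   # placeholder base model
adapter_dir = "path/to/qlora-adapter"  # placeholder adapter checkpoint

# Load the base model in half precision (not 4-bit) and fold the adapter in.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_dir)
merged = model.merge_and_unload()

merged.save_pretrained("merged-sft-model")
AutoTokenizer.from_pretrained(base_id).save_pretrained("merged-sft-model")
```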

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction Multiturn datasets are supported in the original dpo example. Wouldn't it be appropriate for...

pending
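
To make the request concrete, here is a hypothetical multi-turn pairwise record in the alpaca-style layout LLaMA-Factory uses elsewhere (instruction/input/output/history); whether the history column is actually consumed by the DPO stage is exactly what the issue asks about:

```python
# Hypothetical record for illustration only: earlier turns live in "history",
# while "output" holds the chosen and rejected responses for the last turn.
example = {
    "instruction": "And how do I revert the last commit?",
    "input": "",
    "output": [
        "Use `git revert HEAD` to create a new commit that undoes it.",  # chosen
        "Just delete the .git folder and start over.",                   # rejected
    ],
    "history": [
        ["How do I create a new branch in git?",
         "Run `git checkout -b <branch-name>`."],
    ],
}
```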

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction Looking at the current sft_packing implementation, it simply concatenates different single-turn SFT samples together and then computes the loss on each target span separately: def preprocess_packed_supervised_dataset( examples: Dict[str, List[Any]], tokenizer: "PreTrainedTokenizer", template: "Template", data_args: "DataArguments", ) ->...

pending
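
To spell out the packing behaviour described above, a simplified, self-contained sketch of the idea (not the actual preprocess_packed_supervised_dataset code): tokenized (prompt, answer) pairs are concatenated, prompt tokens are masked out of the labels with IGNORE_INDEX, and the stream is chunked into cutoff_len-sized blocks, with no attention-mask boundary between the packed samples, which is the point the issue raises.

```python
from typing import List, Tuple

IGNORE_INDEX = -100  # label value ignored by the cross-entropy loss

def pack_examples(pairs: List[Tuple[List[int], List[int]]], cutoff_len: int):
    """Simplified packing sketch.

    Concatenates (prompt_ids, answer_ids) pairs into one token stream, gives
    prompt positions IGNORE_INDEX labels so only answer spans contribute to
    the loss, then splits the stream into cutoff_len-sized blocks (dropping
    the trailing remainder).
    """
    input_ids, labels = [], []
    for prompt_ids, answer_ids in pairs:
        input_ids += prompt_ids + answer_ids
        labels += [IGNORE_INDEX] * len(prompt_ids) + answer_ids

    total_len = (len(input_ids) // cutoff_len) * cutoff_len
    return {
        "input_ids": [input_ids[i:i + cutoff_len] for i in range(0, total_len, cutoff_len)],
        "labels": [labels[i:i + cutoff_len] for i in range(0, total_len, cutoff_len)],
    }
```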

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction I would like to know if there is an option to evaluate the model...

pending