LLaMA-Factory issues

Results 548 LLaMA-Factory issues

Qwen72B-Chat: error when running with vLLM after pre-training, merging, and exporting the model

Pre-training was configured with ZeRO-3: ``` { "fp16": { "enabled": "auto", "loss_scale": 0, "loss_scale_window": 1000, "initial_scale_power": 16, "hysteresis": 2, "min_loss_scale": 1 }, "bf16": { "enabled": "auto" }, "optimizer": { "type": "AdamW", "params": { "lr":...

Fire-Star

pending

[Feature] Will training of multimodal models such as LLaVA be supported in the future?

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction As the title says. ### Expected behavior _No response_ ### System Info _No response_ ### Others _No...

Vincent131499

enhancement

pending

After DPO training of a LoRA, the model generates garbled output

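### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction Hello, I trained a LoRA model with SFT and it works normally. I now want to improve it further with DPO-stage training. However, after training with the script below, the output is garbled (randomly repeated digits or strings). I have checked the dataset repeatedly and found no problems. What am I doing wrong? Also, my goal is to continue training the LoRA, and I expect the training output to be the improved LoRA model. The description of the adapter_name_or_path parameter says "path to sft checkpoint". Should I put the LoRA model there, or the model obtained by merging the LoRA with the base model? Thank you very much! > CUDA_VISIBLE_DEVICES=0 deepspeed --num_gpus=1 /root/LLaMA-Factory/src/train_bash.py \...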

IvoryTower800

pending

Can a checkpoint be selected directly in the web UI? Training could then be aborted at any time, and the adapter could be chosen directly from a checkpoint

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction Can a checkpoint be selected directly in the web UI? Training could then be aborted at any time, and the adapter could be chosen directly from a checkpoint. ### Expected behavior _No response_ ### System Info _No response_ ### Others _No...

tiger55cn

enhancement

pending

[feature] update README

Brikarl

When running full-parameter pre-training (pt) with Bloom, the loss approaches 0 after 60 steps

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction nohup deepspeed --num_gpus 8 --master_port=5544 src/train_bash.py \ --deepspeed ds_config.json \ --stage pt \ --do_train...

wqc007

pending

What should I do to confirm flash-attn information during the training process?

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction #1882 When using the qwen model, flash-attn is automatically enabled. How can I confirm...

KelleyYin

pending

PPO with ZeRO-3 fails to load a reward model trained with full-parameter tuning.

In the PPO stage, training starts normally with ZeRO-2, but with ZeRO-3 the reward model fails to load. Training arguments: ![Capture](https://github.com/hiyouga/LLaMA-Factory/assets/51126181/955f515d-e0d2-492c-b8eb-12d7b67eb0a9) The error is as follows: ![Capture 2](https://github.com/hiyouga/LLaMA-Factory/assets/51126181/e47c3bc6-82a3-4227-b3e5-277b195d2fa7) ds_config_zero3.config is as follows: ![Capture 3](https://github.com/hiyouga/LLaMA-Factory/assets/51126181/137a3880-9648-46d8-8f48-1983f27b5ab2)

Luoxiaohei41

bug

pending

Help: Need guidance on correctly saving and loading optimizer, scheduler, and scaler during training checkpoints

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction In my training script, I'm attempting to resume training from a checkpoint. However, upon...

GavinZhao19

pending

12/24/2023 09:04:10 - INFO - llmtuner.tuner.core.loader - trainable params: 0 || all params: 7069016064 || trainable%: 0.0000 Killed

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction ***** train metrics ***** epoch = 3.0 train_loss = 1.6394 train_runtime = 3:33:32.94 train_samples_per_second...

1Jenifer

pending
