LLaMA-Factory 偏好训练，如何使用ShareGPT格式数据集

偏好训练，如何使用ShareGPT格式数据集

Open binganao opened this issue 9 months ago • 1 comments

Reminder

[X] I have read the README and searched the existing issues.

Reproduction

WebUI 运行

Expected behavior

想要对齐 Chat 模型

我看了相关 Issue，提到 把历史对话中每轮都拆出来，构造 chosen 和 rejected https://github.com/hiyouga/LLaMA-Factory/issues/3495

还是不太理解这个操作应该如何完成，能否提供一个比较完整的例子，比如下面这个数据，应该如何处理成一个支持偏好对齐的数据

[
    {
        "role": "system",
        "content": "system_message"
    },
    {
        "role": "user",
        "content": "user_message"
    },
    {
        "role": "assistant",
        "content": "assistant_message"
    },
    {
        "role": "user",
        "content": "user_message"
    },
    {
        "role": "assistant",
        "content": "assistant_message"
    },
]

System Info

No response

Others

No response

May 16 '24 09:05 binganao

似乎偏好数据集不支持这个格式

May 17 '24 09:05 Victoriaheiheihei

似乎偏好数据集不支持这个格式

看起来好像是的，我的目的是偏好优化多轮对话，讲道理应该能支持我的目标，但是如何构造数据集我不太懂

May 19 '24 04:05 binganao

已经支持 https://github.com/hiyouga/LLaMA-Factory/blob/main/data/README_zh.md#%E5%81%8F%E5%A5%BD%E6%95%B0%E6%8D%AE%E9%9B%86-1

May 19 '24 08:05 hiyouga

LLaMA-Factory LLaMA-Factory copied to clipboard

偏好训练，如何使用ShareGPT格式数据集

Reminder

Reproduction

Expected behavior

System Info

Others

LLaMA-Factory
LLaMA-Factory copied to clipboard