DPO Training Fails with Tool Role Messages - KeyError: 'tool'

Open Ofir408 opened this issue 2 months ago • 1 comment

Please check that this issue hasn't been reported before.

  • [x] I searched previous Bug Reports and didn't find any similar reports.

Expected Behavior

DPO Training Fails with Tool Role Messages - KeyError: 'tool'

Bug Description

DPO training fails when the dataset contains messages with role: "tool", even though Llama 3.1 natively supports tool roles.

Error

KeyError: 'tool'
File "/axolotl/prompt_strategies/dpo/chat_template.py", line 57
"role": role_map[m[message_property_mappings["role"]]],

Reproduction

Data format:

{
  "messages": [
    {"role": "user", "content": "Search for AI info"},
    {"role": "assistant", "tool_calls": [{"id": "call_123", "function": {"name": "search"}}]},
    {"role": "tool", "name": "search", "tool_call_id": "call_123", "content": "Results..."}
  ],
  "chosen": {"role": "assistant", "content": "Based on results..."},
  "rejected": {"role": "assistant", "content": "I don't know."}
}
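To confirm which roles a record like the one above actually contains before running preprocessing, a small check along these lines may help. This is only a sketch: `roles_in_record` is a hypothetical helper, not part of axolotl, and the record is copied from the data format shown above.

```python
# Sample record copied from the issue's data format.
RECORD = {
    "messages": [
        {"role": "user", "content": "Search for AI info"},
        {"role": "assistant", "tool_calls": [{"id": "call_123", "function": {"name": "search"}}]},
        {"role": "tool", "name": "search", "tool_call_id": "call_123", "content": "Results..."},
    ],
    "chosen": {"role": "assistant", "content": "Based on results..."},
    "rejected": {"role": "assistant", "content": "I don't know."},
}


def roles_in_record(record, field_messages="messages", role_key="role"):
    """Hypothetical helper: collect the role names used in the conversation history."""
    return {message[role_key] for message in record[field_messages]}


# Any role listed here needs a corresponding entry in the role mapping.
print(sorted(roles_in_record(RECORD)))
```

Running this over every line of data.jsonl would show whether "tool" is the only role outside the default user/assistant/system set.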

Config:

base_model: meta-llama/Llama-3.1-8B-Instruct
rl: dpo
chat_template: llama3
datasets:
  - path: data.jsonl
    type: chat_template
    field_messages: "messages"
    field_chosen: "chosen" 
    field_rejected: "rejected"
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
      tool: ["tool"]
    roles_to_train: ["assistant"]

Command: axolotl preprocess config.yaml

Expected Behavior

Should work since:

  • Llama 3.1 supports tool roles natively
  • SFT training works fine with same data/config
  • llama3 chat template supports tools

Root Cause

The DPO chat template processor is missing the tool role in role_map.
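The failure mode can be reproduced in isolation. The sketch below assumes that role_map is a plain dict from source role to target role and that it ends up without a "tool" entry. That assumption is consistent with the traceback but has not been checked against the axolotl source. `build_role_map` and `map_role` are hypothetical names used for illustration only.

```python
# Hypothetical stand-ins: the names mirror the traceback line, not axolotl's actual code.
message_property_mappings = {"role": "role", "content": "content"}


def build_role_map(roles):
    """Invert {target: [sources]} into {source: target} (assumed shape of role_map)."""
    return {source: target for target, sources in roles.items() for source in sources}


def map_role(message, role_map):
    # Same lookup shape as the failing line in the traceback.
    return role_map[message[message_property_mappings["role"]]]


tool_message = {"role": "tool", "name": "search", "tool_call_id": "call_123", "content": "Results..."}

# A map built only from the three default roles has no entry for "tool".
without_tool = build_role_map({"user": ["user"], "assistant": ["assistant"], "system": ["system"]})
# A map that also includes the tool role resolves the lookup.
with_tool = build_role_map({**{"user": ["user"], "assistant": ["assistant"], "system": ["system"]}, "tool": ["tool"]})

try:
    map_role(tool_message, without_tool)
except KeyError as err:
    print("KeyError:", err)

print(map_role(tool_message, with_tool))
```

If the `roles` block from the config were reaching role_map unchanged, the second lookup suggests the error would not occur, so it may be worth checking whether that block is being dropped or overridden on the DPO path.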

Request

Add native tool role support to DPO training to match SFT capabilities.

Current behaviour

See the description above.

Steps to reproduce

See the reproduction above.

Config yaml


Possible solution

No response

Which Operating Systems are you using?

  • [ ] Linux
  • [x] macOS
  • [ ] Windows

Python Version

3.10

axolotl branch-commit

main

Acknowledgements

  • [x] My issue title is concise, descriptive, and in title casing.
  • [x] I have searched the existing issues to make sure this bug has not been reported yet.
  • [x] I am using the latest version of axolotl.
  • [x] I have provided enough information for the maintainers to reproduce and diagnose the issue.

Ofir408, Oct 16 '25 13:10

Hey, as mentioned in Discord, can you try this?

    message_property_mappings:
      role: role
      content: content
+     tool: tool

NanoCode012, Oct 17 '25 13:10