axolotl icon indicating copy to clipboard operation
axolotl copied to clipboard

Allow passing args to `apply_chat_template`

Open BitPhinix opened this issue 6 months ago • 5 comments

⚠️ Please check that this feature request hasn't been suggested before.

  • [x] I searched previous Ideas in Discussions didn't find any similar feature requests.
  • [x] I searched previous Issues didn't find any similar feature requests.

🔖 Feature description

Allow passing args like enable_thinking to tokizer.apply_chat_template to enable SFT without thinking traces for models like qwen3

✔️ Solution

Either add an option to explicitly set enable_thinking in the config, or add a more generalized way to pass down args to tokizer.apply_chat_template

❓ Alternatives

No response

📝 Additional Context

No response

Acknowledgements

  • [x] My issue title is concise, descriptive, and in title casing.
  • [x] I have searched the existing issues to make sure this feature has not been requested yet.
  • [x] I have provided enough information for the maintainers to understand and evaluate this request.

BitPhinix avatar May 09 '25 22:05 BitPhinix

Hey, this could be an interesting option as model creators expand on that function.

Could you help refresh how enable_thinking help for training? Shouldn't the dataset already contain the thinking traces?

NanoCode012 avatar May 13 '25 05:05 NanoCode012

enable_thinking is by default True when using apply_chat_template. That means axolotl is basically incompatible with training Qwen3 as a non-thinking model, which may be desirable for a lot of use-cases where you don't have thinking data and can't get it. Do you think you could help introduce this parameter? @NanoCode012

casper-hansen avatar May 17 '25 17:05 casper-hansen

@casper-hansen , thanks for that refresher. In this case, would a better default be False? I can look into allowing chat_template kwargs

NanoCode012 avatar May 19 '25 09:05 NanoCode012

@NanoCode012 I'm not sure of the internals in axolot, but a good check is to figure out where/if apply_chat_template is used and then allow chat_template kwargs.

casper-hansen avatar May 19 '25 09:05 casper-hansen

I linked a draft PR for this. If either of you are interested in testing this, please do give it a try.

NanoCode012 avatar May 20 '25 09:05 NanoCode012