How to turn off the thinking mode

Open ccp123456789 opened this issue 1 month ago • 3 comments

How to turn off the thinking mode training if I don't want to use it with the Qwen3 series models?

Dec 01 '25 08:12 ccp123456789

So you want to remove the thinking tokens from the response for training? I think that will cause discrepency between training and inference.

Dec 02 '25 03:12 ultmaster

I'm also interested. There is a enable_thinking parameter in huggingface apply_chat_template function for Qwen3 models but I did not know where to pass it within agent-lightning/verl. I suppose that if we put the same value in training and inference this would not cause any discrepancy

Dec 02 '25 11:12 xavier-owkin

Has verl figured that out? If they haven't, we are unable to help despite we want to.

Dec 02 '25 12:12 ultmaster