LongBench
Do models really answer directly in the non-CoT setting?
I noticed that there is not a big difference between non-CoT and CoT performance. I'm curious whether the models are in fact answering directly, without any intermediate reasoning, in the "non-CoT" setting.
I think many models will generate intermediate reasoning anyway, given the prompt template from the paper (a sketch of filling it programmatically follows the template):
Please read the following text and answer the question below.
<text>
{Long Context}
</text>
What is the correct answer to this question: {Question}
Choices:
(A) {Choice A}
(B) {Choice B}
(C) {Choice C}
(D) {Choice D}
Format your response as follows: “The correct answer is (insert answer here)”.
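For concreteness, here is a minimal sketch of how this template could be filled per example. The template string is transcribed from the prompt above, but the field names (`context`, `question`, `choice_a`, ...) and the placeholder values are hypothetical:

```python
# Sketch of filling the no-CoT template above. The field names are
# hypothetical, not from the paper.
NO_COT_TEMPLATE = (
    "Please read the following text and answer the question below.\n\n"
    "<text>\n{context}\n</text>\n\n"
    "What is the correct answer to this question: {question}\n"
    "Choices:\n"
    "(A) {choice_a}\n"
    "(B) {choice_b}\n"
    "(C) {choice_c}\n"
    "(D) {choice_d}\n\n"
    'Format your response as follows: "The correct answer is (insert answer here)".'
)

prompt = NO_COT_TEMPLATE.format(
    context="(long context here)",
    question="(question here)",
    choice_a="(choice A)",
    choice_b="(choice B)",
    choice_c="(choice C)",
    choice_d="(choice D)",
)
print(prompt)
```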
Do you have any insights on this, or statistics on the output lengths of various models?
Thanks
Testing on Gemma 3 shows that the no-CoT prompt elicits a direct answer, followed by an explanation.
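For anyone who wants to reproduce this check or gather output-length statistics, here is a minimal sketch using the Hugging Face transformers chat pipeline. The checkpoint (google/gemma-3-1b-it), the decoding settings, and the toy example are my assumptions, not necessarily what was used above:

```python
# Sketch: does the no-CoT prompt elicit a direct answer, and how long
# are the outputs? Model ID, decoding settings, and the toy prompt are
# assumptions.
import re
from statistics import mean, median

from transformers import pipeline

generator = pipeline("text-generation", model="google/gemma-3-1b-it")

prompts = [
    # In practice, fill the no-CoT template above with each benchmark example.
    "Please read the following text and answer the question below.\n\n"
    "<text>\nParis has been the capital of France since 987.\n</text>\n\n"
    "What is the correct answer to this question: What is the capital of France?\n"
    "Choices:\n(A) Berlin\n(B) Paris\n(C) Rome\n(D) Madrid\n\n"
    'Format your response as follows: "The correct answer is (insert answer here)".'
]

lengths, direct = [], 0
for p in prompts:
    out = generator([{"role": "user", "content": p}],
                    max_new_tokens=512, do_sample=False)
    reply = out[0]["generated_text"][-1]["content"]
    lengths.append(len(reply.split()))  # word count as a rough length proxy
    # Count a response as "direct" only if it *opens* with the answer string;
    # anything before it would be intermediate reasoning.
    if re.match(r"\s*The correct answer is", reply):
        direct += 1

print(f"direct answers: {direct}/{len(prompts)}")
print(f"output length (words): mean={mean(lengths):.1f}, median={median(lengths)}")
```

Run over the full benchmark, the mean/median lengths would give a rough answer to the output-length question, and matching only at the start of the reply distinguishes "answer first, explanation after" from "reasoning first".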