[Bug]: o4-mini and o3-mini display no thoughts

Open neubig opened this issue 7 months ago • 9 comments

Is there an existing issue for the same bug? (If one exists, thumbs up or comment on the issue instead).

  • [x] I have checked the existing issues.

Describe the bug and reproduction steps

When using OpenHands with o4-mini or o3-mini, they display no thoughts in the frontend.

This is confusing to users, who cannot tell why the agent did what it did.

Thanks @kentyman23 for pointing this out.

OpenHands Installation

Docker command in README

OpenHands Version

No response

Operating System

None

Logs, Errors, Screenshots, and Additional Context

No response

neubig avatar May 19 '25 00:05 neubig

NOTABUG: The OpenAI Responses API (unlike the ChatGPT Web App API) does not expose chains of thought in the returned response.

erkinalp avatar May 19 '25 06:05 erkinalp

I know the OpenAI API does not reveal the model's internal thoughts, but from a user-experience perspective we want an explanation of what the model is doing so that users can follow along. We need to find a way to fix this for o4-mini.

neubig avatar May 19 '25 12:05 neubig

The only way would be to use the ChatGPT Web App API, but not even that shows the raw CoT. OpenAI o-series models aren't trained to produce human-readable chain-of-thought responses; the reasoning trace shown in the Web UI is generated by a separate language model translating from ChatGPT gibberish into your language.

erkinalp avatar May 19 '25 15:05 erkinalp

I'm confused. In Think->Act->Observe, aren't we supposed to see what it observed (but not the reasoning trace on how it came up with that observation)? As is, these models are basically unusable because they go off the rails without any idea why.

kentyman23 avatar Jun 17 '25 20:06 kentyman23

I think the issue is that o3 and o4-mini have internal thoughts that they don't show to users.

neubig avatar Jun 17 '25 22:06 neubig

> I think the issue is that o3 and o4-mini have internal thoughts that they don't show to users.

Yes, but I guess I'm surprised there aren't external observations to show the users. Aren't those missing, too?

kentyman23 avatar Jun 18 '25 13:06 kentyman23

@kentyman23 Those are not exposed in the Responses API; they are only exposed in ChatGPT web and internal APIs.

erkinalp avatar Jun 18 '25 14:06 erkinalp

If nothing else, maybe the interface should recognize that you're using such a model and give some sort of explanation of expectations. Otherwise, I feel like others might think things are broken.
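
For illustration only, here is a minimal sketch of what such a check might look like. It is not OpenHands code: the function name `hidden_reasoning_notice`, the prefix list, and the notice text are all hypothetical, and the prefixes simply reflect the models named in this thread.

```python
from typing import Optional

# Assumption: name prefixes of models whose reasoning is not returned by the API.
# Illustrative only; this list is based on the models mentioned in this issue.
HIDDEN_REASONING_PREFIXES = ("o3", "o4-mini")


def hidden_reasoning_notice(model: str) -> Optional[str]:
    """Return a user-facing notice if the model hides its reasoning, else None."""
    # Drop an optional provider prefix such as "openai/" and normalize case.
    name = model.rsplit("/", 1)[-1].lower()
    if name.startswith(HIDDEN_REASONING_PREFIXES):
        return (
            f"{name} does not return its reasoning through the API, "
            "so no thoughts will be shown for this model."
        )
    return None
```

A frontend could call something like this once when the model is selected and show the returned text as a banner, so that an empty thoughts panel is not mistaken for a failure.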

kentyman23 avatar Jun 18 '25 16:06 kentyman23

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Jul 19 '25 02:07 github-actions[bot]

NOTABUG

erkinalp avatar Jul 19 '25 08:07 erkinalp

GPT-OSS-120B has a human-readable chain of thought, but performs worse than the hosted version due to the post-training required to make the chain of thought readable and policy-following.

erkinalp avatar Aug 02 '25 19:08 erkinalp

This issue is stale because it has been open for 40 days with no activity. Remove the stale label or leave a comment, otherwise it will be closed in 10 days.

github-actions[bot] avatar Sep 12 '25 02:09 github-actions[bot]

I still think OpenHands can make this clearer to the user.

kentyman23 avatar Sep 12 '25 16:09 kentyman23

This issue is stale because it has been open for 40 days with no activity. Remove the stale label or leave a comment, otherwise it will be closed in 10 days.

github-actions[bot] avatar Oct 23 '25 02:10 github-actions[bot]

This issue was automatically closed due to 50 days of inactivity. We do this to help keep the issues somewhat manageable and focus on active issues.

github-actions[bot] avatar Nov 02 '25 02:11 github-actions[bot]