[Agent] Implement Critic Model

Open jpelletier1 opened this issue 7 months ago • 5 comments

What problem or use case are you trying to solve?

The idea of choosing the best of multiple candidate solutions has been tried by other SWE-bench submissions, but those strategies were generally based on prompting an existing model such as Claude. Rather than using this prompt-based reranking strategy, we trained a dedicated critic model, which we found gave more effective results.
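As a rough illustration of the selection step only, here is a minimal sketch of best-of-N reranking with a pluggable scoring function. The names `select_best` and `toy_score` are hypothetical and not part of OpenHands; the toy scorer (shorter patch wins) merely stands in for a trained critic model.

```python
from typing import Callable, Sequence


def select_best(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the candidate with the highest critic score.

    Ties are resolved in favor of the earliest candidate, because
    Python's max() returns the first maximal element it encounters.
    """
    if not candidates:
        raise ValueError("need at least one candidate solution")
    return max(candidates, key=score)


def toy_score(patch: str) -> float:
    """Hypothetical stand-in for a trained critic: prefer shorter patches."""
    return -float(len(patch))


if __name__ == "__main__":
    patches = ["aaa", "b", "cc"]
    print(select_best(patches, toy_score))
```

In a real setup the `score` callable would wrap a call to the critic model; the selection logic itself would not need to change.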

The goal of this issue is to implement a critic model in OpenHands.

Additional context

Read more about the OpenHands Critic model here: https://www.all-hands.dev/blog/sota-on-swe-bench-verified-with-inference-time-scaling-and-critic-model

If you find this feature request or enhancement useful, make sure to add a 👍 to the issue

jpelletier1 Jun 06 '25 19:06

@xingyaoww One scenario that has come up recently is OpenHands generating more code than it needs to for a particular task. Is this the type of thing a Critic Model could help with?

jpelletier1 Sep 18 '25 13:09

☝️ I've been thinking a lot about this lately. There should be two types of critic (ideally with the same interface):

  • process-focused: judge whether the agent is solving the problem with the correct process (e.g., using the right tool, doing the right thing)
  • outcome-based: simply look at the output patch and judge based on that

I'm considering building a more "unified" version of the critic that can potentially do both.
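One way to picture "two critics, same interface" is the sketch below. Everything in it is an assumption made for illustration: the `Trajectory` fields, the `Critic` protocol, and the toy scoring rules are hypothetical and do not reflect the actual agent-sdk design. The "unified" critic here is just a weighted average of the other two.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Trajectory:
    """Hypothetical record of one agent run."""
    tool_calls: list[str] = field(default_factory=list)
    final_patch: str = ""


class Critic(Protocol):
    """Shared interface: every critic maps a trajectory to a score."""
    def score(self, trajectory: Trajectory) -> float: ...


class ProcessCritic:
    """Toy process critic: fraction of tool calls drawn from an allowed set."""
    def __init__(self, allowed_tools: set[str]) -> None:
        self.allowed_tools = allowed_tools

    def score(self, trajectory: Trajectory) -> float:
        if not trajectory.tool_calls:
            return 0.0
        ok = sum(1 for t in trajectory.tool_calls if t in self.allowed_tools)
        return ok / len(trajectory.tool_calls)


class OutcomeCritic:
    """Toy outcome critic: 1.0 if a non-empty patch was produced, else 0.0."""
    def score(self, trajectory: Trajectory) -> float:
        return 1.0 if trajectory.final_patch.strip() else 0.0


class UnifiedCritic:
    """Weighted average of several critics behind the same interface."""
    def __init__(self, critics: list[Critic], weights: list[float]) -> None:
        if len(critics) != len(weights) or sum(weights) <= 0:
            raise ValueError("need one positive-sum weight per critic")
        self.critics = critics
        self.weights = weights

    def score(self, trajectory: Trajectory) -> float:
        total = sum(w * c.score(trajectory)
                    for c, w in zip(self.critics, self.weights))
        return total / sum(self.weights)


if __name__ == "__main__":
    traj = Trajectory(tool_calls=["edit", "bash", "rm"], final_patch="diff --git ...")
    unified = UnifiedCritic([ProcessCritic({"edit", "bash"}), OutcomeCritic()], [0.5, 0.5])
    print(unified.score(traj))
```

Because all three classes expose the same `score` method, a caller can swap a process critic, an outcome critic, or a combination without changing its own code.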

We will likely implement this inside agent-sdk. Maybe we can move this issue there?

xingyaoww Sep 18 '25 20:09

@jpelletier1 is this agent-sdk now?

mamoodi Nov 06 '25 18:11

@xingyaoww I'm assuming this ticket stays within the OpenHands/OpenHands project, right?

jpelletier1 Nov 11 '25 15:11

@jpelletier1 Yes! We will experiment with it in the SDK and eventually integrate it here.

xingyaoww Nov 11 '25 16:11