Graham Neubig comments

Results: 874 comments of Graham Neubig

[Bug]: Agent finishes before tasks are fully complete

I agree that the LLM decides when to stop, but I think we need to have a better mechanism to double-check that the user's task has actually been fully completed.

[Bug]: Agent finishes before tasks are fully complete

Also highly related to #2221

add Visual SWE-bench benchmark

Hey @luolin101, did you try running OpenHands on the benchmark to see what the accuracy was?

[Bug]: litellm.BadRequestError on interactive browsing

I am encountering the same error with the most recent version of the OpenHands cloud. This seems to get triggered with interactive browsing.

[Bug]: litellm.BadRequestError on interactive browsing

@openhands do a root cause analysis of this error and try to write a test to reproduce the error. The test must FAIL to demonstrate that the error exists, it...

[Bug]: litellm.BadRequestError on interactive browsing

@openhands It seems that this error is resulting from empty image URLs, for instance: `{'type': 'image_url', 'image_url': {'url': ''}}`. Please clone the litellm GitHub repo and read both...

[Bug]: litellm.BadRequestError on interactive browsing

This should be closed; I believe the fix will be included in the 0.45.0 release.

feature: Condenser Interface and Defaults

(I asked OpenHands to resolve the merge conflicts so that we can check to see if CI is passing and it seems like it's not, so that will also have...

feature: Condenser Interface and Defaults

When this has no merge conflicts, has passing tests, and is ready for review, could you re-request my review through GitHub by pressing the "cycle" button? Thanks!

Display API costs in frontend

Based on a follow-up discussion with @rbren, we should maybe put the statistics in a separate window to prevent clutter in the main interface. But the window could be...