Graham Neubig

874 comments by Graham Neubig

I agree that the LLM decides when to stop, but I think we need to have a better mechanism to double-check that the user's task has actually been fully completed.

Hey @luolin101, did you try running OpenHands on the benchmark to see what the accuracy was?

I am encountering the same error with the most recent version of the OpenHands cloud. This seems to get triggered with interactive browsing.

@openhands do a root cause analysis of this error and try to write a test to reproduce the error. The test must FAIL to demonstrate that the error exists, it...

@openhands It seems that this error is resulting from empty image URLs, for instance:

```
{'type': 'image_url', 'image_url': {'url': ''}}
```

Please clone the litellm github repo and read both...
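For illustration, here is a minimal sketch of one way to guard against this kind of payload before it reaches the model: drop any `image_url` content part whose URL is empty. The helper name `drop_empty_image_parts` and the sample messages are hypothetical, and this is not the fix that was actually applied in OpenHands or litellm; it only assumes the content-part shape shown in the example above.

```python
from typing import Any


def drop_empty_image_parts(content: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the content parts, omitting image_url parts with an empty URL.

    Hypothetical helper for illustration only; not part of OpenHands or litellm.
    """
    kept = []
    for part in content:
        if part.get("type") == "image_url":
            image = part.get("image_url")
            # The URL may be nested in a dict (as in the example above)
            # or given directly as a string.
            url = image.get("url") if isinstance(image, dict) else image
            if not isinstance(url, str) or not url.strip():
                continue  # skip the empty image reference
        kept.append(part)
    return kept


# Sample content parts (made up for this sketch).
content = [
    {"type": "text", "text": "What is on this page?"},
    {"type": "image_url", "image_url": {"url": ""}},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
]
cleaned = drop_empty_image_parts(content)
```

Filtering like this would only hide the symptom; the root cause of why an empty URL is produced in the first place would still need to be tracked down.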

This should be closed, and I believe it will be included in the 0.45.0 release.

(I asked OpenHands to resolve the merge conflicts so that we can check to see if CI is passing and it seems like it's not, so that will also have...

When this has no merge conflicts, has passing tests, and is ready for review, could you re-request my review through GitHub by pressing the "cycle" button? Thanks!

Based on a follow-up discussion with @rbren, we should maybe put the statistics in a separate window to avoid cluttering the main interface. But the window could be...