Jiayi Pan

21 comments by Jiayi Pan

A nice way to improve open-source LLMs is to fine-tune them on trajectories from stronger models like GPT-4. Bonus points if we can filter out the bad ones. One way...

> Thanks @Jiayi-Pan!! All of the bullet points mentioned are actually on our roadmap :)) Amazing and thanks for the pointer! I will have a look and see what I...

Hi friends, we’ve got AutoUI running and tested its end-to-end performance in our recent paper. You can find the inference code here: https://github.com/Berkeley-NLP/Agent-Eval-Refine/tree/main/exps/android_exp/models/Auto-UI

happy to take care of the logging stuff (state, action, agent input/output, metadata, ...), will try to finish this in a few days. plan to do more after nips...

I think it's mostly information seeking. Besides, the benchmark covers a wide range of difficulties and other scenarios, so we don't need to worry about getting a score of 0 lol

Thanks everyone for the discussion! And thanks to @li-boxuan for fixing the agent hang bug. After a few more bug fixes, I believe the Gaia evaluation harness is now pretty...

One interesting thing I discovered while testing a DOCX understanding question is that Open-Dev’s agent has a sufficiently broad action space that it can develop multi-modal understanding skills by...

thanks @xingyaoww for the review. I've addressed all the issues.

This is awesome! This looks like a good starting point to get multi-modal understanding / gaia agent started. Kind of horizontal to this, I do wonder if we should also...