Jason J
Jason J
--max_trajectory_length 15 启动的参数加上这个就可以了
@ZFish-Lu It seems that the problem is caused by the input limit of VLM. I adjusted the size of the input history_n and the problem was solved. Maybe the problem...
> 在OSWorld的测评代码uitars_agent.py中,当observation_type为screenshot的时候出现bug:Invalid observation_type type: screenshot 应该是这部分代码有些问题 > > 我是直接设置的observation的type为screenshot_a11_tree,在predict构建prompt的时候只传入了screenshot
> 你现在能成功复现吗? 我这边测试特别慢,结果还没出来
> > 运行OSWorld的时候遇到如下bug,无论是vllm部署还是基于transformers部署都会出现如下错误,部署端日志显示应该是传的message格式有问题 > > > > [server.log](https://github.com/user-attachments/files/19961586/server.log) > > 是评测代码里message拼接有问题,已成功运行 是的,我之前也是这个问题,应该写成dict的格式
@manmushanhe Hello, I also encountered some problems when reproducing the sales results of this paper. Can you share your test code? Thank you very much.
@manmushanhe Thank you very much for sharing the test code. It will be very helpful for me.
还有action space定义也是不一样,osworld当中的prompt定义是box的坐标,而ui-tars定义的是一个点的坐标,使用osworld当中的prompt返回还是一个坐标点而不是box(两个坐标)