luziyu comments

Results: 8 comments by luziyu

OSWorld cannot parse the response

I've also encountered the same problem.

OSWorld cannot parse the response

In OSWorld's evaluation code uitars_agent.py, a bug occurs when observation_type is screenshot: `Invalid observation_type type: screenshot`. That part of the code appears to be at fault.

OSWorld cannot parse the response

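I hit the following bug when running OSWorld. The same error occurs whether the model is deployed with vllm or with transformers, and the server-side log suggests that the message format being sent is wrong. [server.log](https://github.com/user-attachments/files/19961586/server.log)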

OSWorld cannot parse the response

> [@ZFish-Lu](https://github.com/ZFish-Lu) It seems that the problem is caused by the input limit of VLM. I adjusted the size of the input history_n and the problem was solved. Maybe the...
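For anyone hitting the same input limit, here is a minimal sketch of what reducing the history size could look like. It assumes OpenAI-style chat messages in which screenshots are `image_url` content parts. The function name `trim_image_history` is hypothetical and is not taken from uitars_agent.py; only the name `history_n` comes from the comment above.

```python
from typing import Any


def trim_image_history(
    messages: list[dict[str, Any]], history_n: int
) -> list[dict[str, Any]]:
    """Keep only the most recent `history_n` screenshots in a chat history.

    Hypothetical helper: assumes each message is a dict with a "content"
    field that is either a plain string or a list of parts, where a
    screenshot part has {"type": "image_url", ...}.
    """
    if history_n < 0:
        raise ValueError("history_n must be non-negative")

    remaining = history_n
    kept_messages: list[dict[str, Any]] = []

    # Walk from newest to oldest so the latest screenshots are the ones kept.
    for message in reversed(messages):
        content = message.get("content")
        if not isinstance(content, list):
            # Plain-text messages are always kept.
            kept_messages.append(message)
            continue

        kept_parts = []
        for part in reversed(content):
            if part.get("type") == "image_url":
                if remaining > 0:
                    remaining -= 1
                    kept_parts.append(part)
                # Older screenshots beyond the budget are dropped.
            else:
                kept_parts.append(part)
        kept_parts.reverse()

        # Drop messages that contained nothing but a discarded screenshot.
        if kept_parts:
            kept_messages.append({**message, "content": kept_parts})

    kept_messages.reverse()
    return kept_messages
```

Calling `trim_image_history(messages, history_n)` just before sending the request keeps the text turns intact while capping the number of screenshots, which is one way to stay under a model's input limit.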

OSWorld cannot parse the response

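> I hit the following bug when running OSWorld. The same error occurs whether the model is deployed with vllm or with transformers, and the server-side log suggests that the message format being sent is wrong.
>
> [server.log](https://github.com/user-attachments/files/19961586/server.log)

The message concatenation in the evaluation code was the problem. It now runs successfully.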

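OSWorld evaluation details

> > Hello, could you publish more parameter details for the OSWorld evaluation? I set the parameters with the command below and left everything else at the code defaults, but the uitars-1.5-7b accuracy I reproduced is only 21.2%, short of the 26.9% you reported. Is the reported number the best of multiple runs? `--headless --observation_type screenshot --sleep_after_execution 0.5 --max_trajectory_length 100`
>
> Hello, did you manage to reproduce it? With uitars-1.5-7b at max-step=15 it is almost unusable for me. Could you share your test code?

My reproduction currently reaches 24% accuracy. Sorry, I can't share the code for now, but if it's a bug I have run into before, I can help take a look.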

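OSWorld evaluation details

> > > > Hello, could you publish more parameter details for the OSWorld evaluation? I set the parameters with the command below and left everything else at the code defaults, but the uitars-1.5-7b accuracy I reproduced is only 21.2%, short of the 26.9% you reported. Is the reported number the best of multiple runs? `--headless --observation_type screenshot --sleep_after_execution 0.5 --max_trajectory_length 100`
> > >
> > > Hello, did you manage to reproduce it? With uitars-1.5-7b at max-step=15 it is almost unusable for me. Could you share your test code?
> >
> > My reproduction currently reaches 24% accuracy. Sorry, I can't share the code for now, but if it's a bug I have run into before, I can help take a look.
>
>...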

How to deploy this model locally and run inference

Same question. When I deployed with vllm to run OSWorld, I ran into the following problem