Junyang Wang comments

Results: 123 comments of Junyang Wang

When will Mobile V3 be available?

> Exception reported: FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\DAIJ\.cache\modelscope\hub\._____temp\AI-ModelScope\GroundingDINO\groundingdino/__init__.py'. What went wrong here?

https://github.com/X-PLUG/MobileAgent/commit/e93345e16559a38f61c0bee0fb4ac0cc4d503f3c

perception_infos index out of range

This is because the previously executed program was unexpectedly interrupted and the cache was not cleared. Please try to clear the temp path under the execution path or delete all...
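A minimal sketch of the cleanup described above, assuming the leftover files sit in a directory named `temp` under the directory the script was run from. The directory name and the helper `clear_temp_dir` are illustrative assumptions, not names confirmed by the repository.

```python
import shutil
from pathlib import Path


def clear_temp_dir(base_dir: str = ".", name: str = "temp") -> bool:
    """Delete base_dir/name with all its contents, then recreate it empty.

    Returns True if a directory was already there and had to be removed.
    The default name "temp" is an assumption for illustration.
    """
    temp_path = Path(base_dir) / name
    existed = temp_path.is_dir()
    if existed:
        # Remove stale crops or screenshots left by an interrupted run.
        shutil.rmtree(temp_path)
    temp_path.mkdir(parents=True, exist_ok=True)
    return existed
```

Calling `clear_temp_dir()` once before starting a new run would ensure that no files from an interrupted run are picked up.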

How do I use MobileAgent-v3?

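> How do I use MobileAgent-v3? This one uses v2: https://github.com/modelscope/modelscope-agent/blob/master/apps/mobile_agent/run.py Is there demo code for v3 like the run.py above?

Thank you for your interest in Mobile-Agent-v3. Because Mobile-Agent-v3 involves training, there are some issues with sensitive personal data, and the model still needs a safety evaluation, which may take some time. We are also continuing to optimize the model's performance, speed, and scale, so that the experience at release is better than it is now. Please star our repository to follow the latest progress.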

How do I use MobileAgent-v3?

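> The takeout-ordering part of the demo video shows a phone number; it should probably be masked. @junyangwang0410

Thanks for the reminder. We will check it.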

Memory issue

If your task does not require remembering task-related content, the agent will not add anything to the memory unit. This has the benefit of reducing the interference of irrelevant information.

A question about the memory unit

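> From the code, it looks like memory is not part of the input to the planning agent or the reflection agent? Also, your demo runs very fast, but when I actually call the API, a single step takes about 7 to 8 seconds?

The planning agent and the reflection agent obtain memory from the output of the decision agent. The demo video was sped up, and the time spent waiting for responses was cut out.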

A question about the memory unit

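> Taking the reflection agent as an example, its prompt input is the decision agent's summary and action. Does that mean memory is also obtained from these two?

Yes. Memory is what the decision agent needs to make decisions; for other agents it has to be converted into the content each agent needs. For example, the operation intent that the reflection agent needs, which is part of the decision agent's output, may draw on the content in memory.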
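The data flow described in these replies could be pictured roughly as follows. This is only a sketch under stated assumptions: the class names, field names, and `reflection_prompt` are hypothetical and do not come from the Mobile-Agent code. It shows memory being written only when a step yields task-related content, and the reflection agent receiving just the decision agent's summary and action.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionOutput:
    """Hypothetical shape of one decision-agent step."""
    summary: str           # operation intent stated by the decision agent
    action: str            # concrete action it chose
    memory_note: str = ""  # task-related content worth remembering, if any


@dataclass
class AgentState:
    """Hypothetical container for the memory unit."""
    memory: list[str] = field(default_factory=list)

    def update(self, out: DecisionOutput) -> None:
        # Nothing is stored when the step produced no task-related content.
        if out.memory_note:
            self.memory.append(out.memory_note)


def reflection_prompt(out: DecisionOutput) -> str:
    # The reflection agent sees only the summary and action; memory reaches
    # it indirectly, through whatever the decision agent drew on.
    return f"Intent: {out.summary}\nAction: {out.action}"
```

Under this reading, a step such as `DecisionOutput("Open the notes app", "Tap (540, 1200)")` leaves the memory list unchanged.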

Code error

> in_coordinate, out_coordinate = det(image, "icon", groundingdino_model) expects two return values here, but the function returns only one. Is this a bug? The code is at line 149 of Mobile-Agent/run.py:
> def det(input_image_path, caption, groundingdino_model, box_threshold=0.05, text_threshold=0.5):
>     image = Image.open(input_image_path)
>     size = image.size
>     caption = caption.lower()
>     caption...

TypeError: annotate() got an unexpected keyword argument 'labels'

> Could you please take a look at what is causing the error below? Python version: 3.9.13; OS: Windows 10
> Traceback (most recent call last):
>   File "D:\Project\script\MobileAgent-main\Mobile-Agent-v2\run.py", line 286, in <module>
>     perception_infos, width, height = get_perception_infos(adb_path, screenshot_file)
>   File "D:\Project\script\MobileAgent-main\Mobile-Agent-v2\run.py", line 190, in get_perception_infos
>     coordinates = det(screenshot_file,...

Add a check based on the XML clickable attribute to decide whether an element can be clicked

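> When the same text appears several times on the phone screen, first filtering by the XML clickable attribute to determine whether each one can be clicked would make phone control more precise.

Hello. On an Android phone, XML can indeed be used to improve localization accuracy. However, this method does not work for phones on all platforms. We want Mobile-Agent to target mobile GUIs in general rather than one specific operating system, so we adopted a purely visual approach that works across all platforms.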
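For readers who want to try the Android-only filtering suggested in the question, here is a minimal sketch. It assumes a UI hierarchy dump shaped like Android's uiautomator output, with `node` elements carrying `text`, `clickable`, and `bounds` attributes. The function name `clickable_centers` and the sample XML are illustrative, and this is not part of Mobile-Agent, which relies on the visual pipeline described in the reply.

```python
import re
import xml.etree.ElementTree as ET

# Bounds are assumed to look like "[x1,y1][x2,y2]".
BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def clickable_centers(xml_text: str, target_text: str) -> list[tuple[int, int]]:
    """Return center points of nodes whose text matches and that are clickable."""
    centers = []
    for node in ET.fromstring(xml_text).iter("node"):
        if node.get("text") != target_text or node.get("clickable") != "true":
            continue  # skip other texts and non-clickable duplicates
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if match:
            x1, y1, x2, y2 = map(int, match.groups())
            centers.append(((x1 + x2) // 2, (y1 + y2) // 2))
    return centers


SAMPLE = """<hierarchy rotation="0">
  <node text="Settings" clickable="false" bounds="[0,0][200,100]" />
  <node text="Settings" clickable="true" bounds="[0,300][200,400]" />
</hierarchy>"""

print(clickable_centers(SAMPLE, "Settings"))  # [(100, 350)]
```

One caveat with this sketch: in some layouts the text node itself is not marked clickable while its parent container is, so a stricter version might need to walk up to the nearest clickable ancestor.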