MobileAgent issues

Qwen API

3

Hello. Thank you for your great research. I want to use Mobile-Agent-v2, and I'm wondering whether the Qwen API is essential for using it. Since I'm an international user, it seems...

shp216

Add a check on the XML clickable attribute to decide whether an element can be clicked

2

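When several elements on the phone screen show the same text, first use the XML clickable attribute to filter out the ones that cannot be clicked. This makes phone control more precise.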

kx-kexi
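The filtering idea in this suggestion can be sketched as follows. This is a minimal illustration, not code from the repository: it assumes the UI hierarchy is available as a uiautomator-style XML dump in which each `node` element carries `text`, `clickable`, and `bounds="[x1,y1][x2,y2]"` attributes. The function name `clickable_centers` and the sample XML are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches a bounds string of the form "[x1,y1][x2,y2]".
BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def clickable_centers(xml_text, target_text):
    """Return center points of nodes whose text matches and that are clickable."""
    root = ET.fromstring(xml_text)
    centers = []
    for node in root.iter("node"):
        if node.get("text") != target_text:
            continue  # different label
        if node.get("clickable") != "true":
            continue  # same label, but not tappable
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if match is None:
            continue  # missing or malformed bounds
        x1, y1, x2, y2 = map(int, match.groups())
        centers.append(((x1 + x2) // 2, (y1 + y2) // 2))
    return centers


# Hypothetical dump: two nodes share the same text, only one is clickable.
SAMPLE = """<hierarchy>
  <node text="Settings" clickable="false" bounds="[0,0][200,100]" />
  <node text="Settings" clickable="true" bounds="[0,300][200,400]" />
</hierarchy>"""

print(clickable_centers(SAMPLE, "Settings"))  # [(100, 350)]
```

One caveat: in real dumps the clickable flag may sit on a parent container rather than on the text node itself, so a practical version would probably need to walk up the tree as well.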

Code error

in_coordinate, out_coordinate = det(image, "icon", groundingdino_model) expects two return values here, but the method returns only one. Is this a bug? The code is at line 149 of Mobile-Agent/run.py: def det(input_image_path, caption, groundingdino_model, box_threshold=0.05, text_threshold=0.5): image = Image.open(input_image_path) size = image.size caption = caption.lower() caption = caption.strip() if not caption.endswith('.'): caption...

kx-kexi
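For context on the report above, the sketch below shows how Python behaves when a call site unpacks two names from a function that returns a single list. It does not reproduce the repository's `det`; `det_stub` and `try_unpack_two` are hypothetical stand-ins, and the box values are made up.

```python
def det_stub():
    """Hypothetical stand-in: returns ONE list of boxes, not a pair of lists."""
    return [[0, 0, 10, 10], [5, 5, 20, 20], [8, 8, 30, 30]]


def try_unpack_two(result):
    """Attempt `first, second = result` and report what happens."""
    try:
        first, second = result
    except ValueError as exc:
        return f"unpack failed: {exc}"
    return f"unpacked: {first} / {second}"


# Three boxes: unpacking into two names raises ValueError.
print(try_unpack_two(det_stub()))

# Exactly two boxes: unpacking succeeds silently, but each name now holds
# a single box rather than a list of boxes, which can hide the mismatch.
print(try_unpack_two(det_stub()[:2]))
```

If the function really does return a single list, the mismatch would only surface as an error when the number of detected boxes differs from two, which makes it easy to miss in testing.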

Question about the memory unit

3

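From the code, it looks like the inputs to the planning agent and the reflection agent do not involve memory at all. Is that right? Also, your demo runs very fast, but when I call the API myself a single step takes about 7 to 8 seconds.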

ouyx189

Is the GPT-4o API_url required?

4

Hello, I see that run.py in v2 asks for the GPT-4o API_url and token. Are these two parameters mandatory? If I use Qwen's qwen_api, is GPT-4o no longer needed?

yiwenliu

This does not seem to have succeeded. What went wrong?

6

![e550a0cb-5456-434b-9779-c0e9aea92c08](https://github.com/X-PLUG/MobileAgent/assets/134773327/78afe4d5-2eca-49fa-ae36-fda2db1b3768)

haoiwang

Cannot find the screenshot.jpg file in the Screenshot directory

7

2024-06-07 11:28:54,386 - modelscope - INFO - loading model done
Traceback (most recent call last):
D:\github-app\MobileAgent\Mobile-Agent-v2\run.py:286 in
283 iter += 1 ...

lxc00215
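When a run fails because the screenshot file is missing, an explicit check before the image is opened can make the cause easier to spot. The sketch below is a generic guard, not code from run.py: the helper name `require_screenshot` is hypothetical, and it assumes the capture step is expected to have written the file already.

```python
from pathlib import Path


def require_screenshot(path):
    """Return the path if the screenshot exists and is non-empty, else raise."""
    shot = Path(path)
    if not shot.parent.is_dir():
        raise FileNotFoundError(f"Screenshot directory missing: {shot.parent}")
    if not shot.is_file() or shot.stat().st_size == 0:
        raise FileNotFoundError(
            f"Screenshot not found or empty: {shot} "
            "(check that the device is connected and the capture step succeeded)"
        )
    return shot
```

Calling this right after the capture step would separate a failed capture from a later image-loading problem.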

Question: in PC-Agent, can the part that uses gpt-4o for dialogue be replaced with a locally deployed Qwen-VL-Chat?

7

![image](https://github.com/user-attachments/assets/5aaa3655-2e50-4649-b12d-6b323ff02444) Can the part marked in the image be replaced with Qwen?

shenyugub

How can I get the agent to click this checkbox and tick it?

2

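![image](https://github.com/user-attachments/assets/2c0dda65-6d44-4136-a672-f0992b06be35) I tried phrasing the instruction as "Click the checkbox to the left of [我已阅读并同意] (I have read and agree) and tick it", but the OCR recognition seems to work mainly in terms of "points", and it never manages to hit the checkbox. ![image](https://github.com/user-attachments/assets/39f24ceb-5b49-428f-b8c0-e98eb3fb075a)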

herist

Can PC-Agent use qwen-vl?

Can PC-Agent use Chinese domestic models? If so, where in the code should the change be made?

balangbalang
