Yu FranzKafka

Results 56 comments of Yu FranzKafka

> Note that using a separate build dir and completely removing it before invoking `cmake` is highly recommended, because `cmake` will otherwise leave cruft that can cause subsequent runs to...
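The advice quoted above (use a separate build directory and remove it completely before invoking `cmake`) can be sketched in a few lines of Python. This is only an illustration: the helper names `fresh_build_dir` and `configure` are hypothetical, and the `-S`/`-B` options assume CMake 3.13 or newer.

```python
import shutil
import subprocess
from pathlib import Path


def fresh_build_dir(build_dir):
    """Delete build_dir if it exists, then recreate it empty.

    Hypothetical helper: removing the whole directory discards stale
    files such as CMakeCache.txt left over from earlier runs.
    """
    build = Path(build_dir)
    if build.exists():
        shutil.rmtree(build)
    build.mkdir(parents=True)
    return build


def configure(source_dir, build_dir):
    """Configure an out-of-source build in a freshly emptied directory.

    Assumes `cmake` is on PATH and supports -S/-B (CMake 3.13+).
    """
    build = fresh_build_dir(build_dir)
    subprocess.run(
        ["cmake", "-S", str(source_dir), "-B", str(build)],
        check=True,  # raise CalledProcessError if configuration fails
    )
    return build
```

Calling `configure(".", "build")` from a project root would then start each configuration from an empty `build` directory.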

Related issue: https://github.com/google-ai-edge/mediapipe/issues/5570

The official MediaPipe website says we can use AI Edge Torch to convert Gemma2-2b to a suitable format, but it gives no further details: ![image](https://github.com/user-attachments/assets/785a1a12-5358-4be6-bbb6-286fa4a7f4a0) If MediaPipe Python Convert...

> Hi @FranzKafkaYu, I am currently looking into running Gemma 2 on AI Edge. Would it be possible to verify the source of the referenced image?
>
> Came across...

> I think the documentation should mention https://github.com/google-ai-edge/ai-edge-torch/blob/main/ai_edge_torch/generative/examples/gemma/convert_gemma2_to_tflite.py

Good, I will try this script and see whether we can get to the next step.

Thanks for your reply; sad to hear that. So what's the major difference between mllm and other projects such as llama.cpp, MNN, and MediaPipe?

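> > [@hi-guy](https://github.com/hi-guy) From the log of your qwen-vl-max run, I noticed that the real coordinates and the model's predicted coordinates differ by a constant factor. If you call qwen-vl-max through dashscope, low-resolution mode is used by default: the image resolution is compressed by a fixed factor, so the predicted coordinates are compressed along with it.
> >
> > If you want to use qwen-vl-max, there are two solutions:
> >
> > 1. Run a few more scenarios, work out the factor between the real and predicted coordinates, then multiply the coordinates by that factor.
> > 2. Turn on high-resolution mode for dashscope calls; see the [documentation](https://help.aliyun.com/zh/model-studio/vision#e7e2db755f9h7).
> >
> > [@wufannet](https://github.com/wufannet) If you are also using qwen-vl-max, you can try this approach too. Separately, we are reproducing GUI-Owl's coordinate predictions with the 7B model; we suspect the coordinate errors are caused by the quantized model.
>
> [@junyangwang0410](https://github.com/junyangwang0410) I suggest updating the v3 README with tips about these cases where quantized models give unusable coordinates. I'm on a laptop, so I start with the quantized option and the Alibaba Cloud API option, and only consider finding a GPU or a cloud service to deploy the full model when those turn out not to work.

Hello, for cloud deployment, are there any affordable platforms you would recommend? I'd like to use Mobile Agent to run some automation tasks. Is there a deployment tutorial I could follow?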
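The first workaround in the quoted reply above (estimate the factor between real and predicted coordinates from a few runs, then multiply by it) could look roughly like the sketch below. It is a minimal illustration, not Mobile Agent code: the function names and sample points are made up, it assumes one uniform factor for both axes, and taking the median of the per-coordinate ratios is just one reasonable way to average out noisy samples.

```python
from statistics import median


def estimate_scale(true_points, predicted_points):
    """Estimate one factor s such that true is roughly s * predicted.

    Assumption: the image was shrunk by the same fixed factor on both
    axes, so x and y ratios are pooled together.
    """
    ratios = []
    for (tx, ty), (px, py) in zip(true_points, predicted_points):
        if px:  # skip zero coordinates to avoid division by zero
            ratios.append(tx / px)
        if py:
            ratios.append(ty / py)
    if not ratios:
        raise ValueError("need at least one non-zero predicted coordinate")
    return median(ratios)


def rescale(point, scale):
    """Map a predicted (x, y) point back to real screen coordinates."""
    x, y = point
    return round(x * scale), round(y * scale)


# Made-up calibration samples: (real tap position, model prediction).
true_points = [(540, 1200), (200, 400), (900, 1800)]
predicted_points = [(270, 600), (100, 200), (450, 900)]

scale = estimate_scale(true_points, predicted_points)
print(scale)                       # 2.0
print(rescale((300, 500), scale))  # (600, 1000)
```

If the ratios collected from real runs vary widely between samples, a single factor is probably not the right model, and the high-resolution mode from the second workaround is the safer route.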