AppAgent
How is the reward model designed?
In Section 4.2 (Reward) of the paper, you say that you developed a reward model that assesses performance by calculating the similarity between the final UI page and the target UI page.
How is the reward model designed and trained? And will it be released?
This project seems to have been abandoned. Better to switch to one from an overseas team.
Here is one of their papers, which might help you: https://arxiv.org/pdf/2312.13771.pdf