How is the reward model designed?

Open EthanLeo-LYX opened this issue 1 year ago • 2 comments

In Section 4.2 (Reward) of the paper, you say that you developed a reward model that assesses performance by calculating the similarity between the final UI page and the target UI page. How is the reward model designed and trained? And will the reward model be released?

Apr 15 '24 12:04 EthanLeo-LYX

This project seems to have been abandoned. You may want to switch to one from an overseas team instead.

Apr 17 '24 06:04 csdaa

Here is one of their papers, which may help: https://arxiv.org/pdf/2312.13771.pdf

Apr 17 '24 06:04 csdaa