fengji.zhang

Results 9 issues of fengji.zhang

Hi there! CodeRL is a brilliant idea, thanks for the effort! I have also dealt with the APPS dataset, and I found it hard to extract example test cases in...

Nice work! Interested in the design of 1 vs 1 battles between LVLMs, but can you share more details about the Elo rating algorithm? Like the choice of k-factor, the...

![image](https://user-images.githubusercontent.com/22430500/140607068-01837cd2-9bcc-4789-95ad-a5d7ad6c5417.png) Hi! I am trying to reproduce your code and ran into a problem when I tried to rebuild the pathminer Kotlin project. There is a package named astminer, but...

### Is there an existing issue / discussion for this? - [X] I have searched the existing issues / discussions ### Is this question answered in the FAQ? ...

Congratulations on the impressive work! I would like to suggest expanding the evaluation of visual reasoning to the **HumanEval-V** benchmark. This benchmark provides a more challenging set of tasks by...

### Motivation I would like to suggest expanding the evaluation of visual reasoning to the **HumanEval-V** benchmark. This benchmark provides a more challenging set of tasks by introducing **complex diagrams**...

I want to reproduce the evaluation pipeline for APPS, but it seems that `../data/apps_metric`, which is invoked in `test_apps.py`, has been removed. How am I supposed to run the evaluation for...

### Start Date _No response_ ### Implementation PR _No response_ ### Reference Issues _No response_ ### Summary Congratulations on the impressive work!...

question

Congratulations on the impressive work! I would like to suggest expanding the evaluation of visual reasoning to the **HumanEval-V** benchmark. This benchmark provides a more challenging set of tasks by...