MoTCoder

How to reproduce the results in the paper.

Open xssstory opened this issue 1 year ago • 3 comments

Hello, I have downloaded the released model and followed the inference command you provided.

However, the strict accuracy I get does not match the number reported in the paper.

My inference command is:

python src/inference.py JingyaoLi/MoTCoder-15B-v1.0/ apps/test.jsonl ./output/generation.jsonl FORMAT_PROMPT

After evaluation, the accuracy on the competition level is:

[Image: screenshot of the evaluation results]

Could you please help me run inference correctly?

xssstory avatar Jan 02 '24 07:01 xssstory

Hi, our reported pass@1 is the average/normalized pass@1. You can refer to this benchmark paper for the detailed metric definition.
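
To illustrate why the two numbers can differ, here is a minimal sketch. It assumes that "strict accuracy" counts a problem as solved only when every test case passes, and that the "average" metric is the mean fraction of test cases passed per problem. The function names and the `results` dictionary are hypothetical and are not taken from the MoTCoder evaluation code; check the benchmark paper for the authoritative definitions.

```python
from typing import Dict, List


def strict_accuracy(results: Dict[str, List[bool]]) -> float:
    """Fraction of problems for which every test case passes (assumed definition)."""
    if not results:
        return 0.0
    solved = sum(1 for tests in results.values() if tests and all(tests))
    return solved / len(results)


def average_test_case_accuracy(results: Dict[str, List[bool]]) -> float:
    """Mean over problems of the fraction of test cases passed (assumed definition)."""
    if not results:
        return 0.0
    per_problem = [
        sum(tests) / len(tests) if tests else 0.0
        for tests in results.values()
    ]
    return sum(per_problem) / len(per_problem)


# Hypothetical per-problem test outcomes, for illustration only.
results = {
    "problem_a": [True, True, True],         # fully solved
    "problem_b": [True, False, True, True],  # partially solved
    "problem_c": [False, False],             # not solved
}

print("strict accuracy:", strict_accuracy(results))
print("average test-case accuracy:", average_test_case_accuracy(results))
```

Under these assumptions, partially solved problems raise the average metric but contribute nothing to strict accuracy, so the average metric is never lower than the strict one on the same outputs.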

JulietLJY avatar Jan 03 '24 09:01 JulietLJY

Thanks for your reply!

I noticed that the pass@1 and pass@5 numbers for GPT-Neo (Table 4 in your paper) are strict accuracies.

I believe it would be better to report all numbers in Table 4 using a consistent metric.

xssstory avatar Jan 05 '24 08:01 xssstory

Thank you for bringing this to our attention. We have verified that what you mentioned is correct. Because of an error we made when reporting numbers in line with previous work, the performance metrics of competing methods in the paper are inaccurate. We will correct this as soon as possible.

JulietLJY avatar Jan 05 '24 09:01 JulietLJY