Evaluation metric calculation

Open • scar-on opened this issue 8 months ago • 1 comment

Dear author, when I evaluated Llama 3.1 on the longbook_qa_eng task, the evaluation results confused me. The predictions looked completely consistent with the reference answers, but the F1 score was not computed correctly and came out as 0.

Mar 27 '25 14:03 scar-on

It seems that <|eot_id|> is a special token that is left in the model output. You can strip it in a pre-processing step before computing the score.
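As a rough illustration of that suggestion, here is a minimal Python sketch. It is not the repository's actual scoring code: the helper names `strip_special_tokens`, `normalize`, and `token_f1` are hypothetical, and the token-level F1 is a generic SQuAD-style variant assumed for the example. It shows one way a trailing <|eot_id|> could drive F1 to 0 even when the answer text matches, and how stripping it first avoids that.

```python
import string
from collections import Counter

# End-of-turn marker mentioned in this thread; extend the list if other
# special tokens show up in your outputs (assumption, not the repo's list).
SPECIAL_TOKENS = ["<|eot_id|>"]


def strip_special_tokens(text):
    """Remove known special tokens from a raw model output (hypothetical helper)."""
    for tok in SPECIAL_TOKENS:
        text = text.replace(tok, " ")
    return text.strip()


def normalize(text):
    """Lowercase, drop ASCII punctuation, and split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def token_f1(prediction, reference):
    """Generic token-overlap F1 between a prediction and a reference answer."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


raw_output = "Paris<|eot_id|>"
reference = "Paris"

# Without pre-processing, the marker fuses with the answer token after
# punctuation removal, so no token overlaps with the reference.
score_raw = token_f1(raw_output, reference)

# With pre-processing, the answer token matches the reference.
score_clean = token_f1(strip_special_tokens(raw_output), reference)
```

If the outputs come from a Hugging Face tokenizer, decoding with `skip_special_tokens=True` is another way to drop such markers before scoring.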

May 14 '25 03:05 tuantuanzhang