Macro-averaging mode added for computing the F0.5 score.

Open zbeloki opened this issue 1 year ago • 0 comments

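When computing the F0.5 score, macro-averaging is sometimes preferable to micro-averaging. Errant currently supports only micro-averaging, which pools the edit counts of all error types before computing the score. This is suitable when the test set is derived from real texts, because frequent error types then weigh more, as they do in practice. However, if the test set consists of manually crafted sentences with grammatical errors, typically grouped by error type, the number of examples per type reflects how the set was built rather than how often the errors occur. In that case it is usually better for all error types to contribute equally to the overall F-score.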
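To make the difference concrete, here is a minimal sketch, not the code from this pull request. It assumes that macro-averaging means the unweighted mean of the per-type F0.5 scores (averaging precision and recall first is another possible definition), and that precision or recall counts as 1.0 when its denominator is zero. The function names and the per-type counts are hypothetical and only for illustration.

```python
from typing import Dict, Tuple

# Hypothetical per-error-type counts: (true positives, false positives, false negatives).
Counts = Tuple[int, int, int]


def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from raw counts; empty denominators give P or R = 1.0 (assumption)."""
    p = tp / (tp + fp) if (tp + fp) else 1.0
    r = tp / (tp + fn) if (tp + fn) else 1.0
    if p + r == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)


def micro_f(counts: Dict[str, Counts], beta: float = 0.5) -> float:
    """Pool the counts over all error types, then compute a single F-beta."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f_beta(tp, fp, fn, beta)


def macro_f(counts: Dict[str, Counts], beta: float = 0.5) -> float:
    """Compute F-beta per error type, then take the unweighted mean."""
    scores = [f_beta(tp, fp, fn, beta) for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores) if scores else 0.0


# Illustrative counts only: one frequent, well-handled type and one rare, poorly handled type.
example_counts = {
    "R:VERB:SVA": (90, 10, 10),
    "M:DET": (1, 4, 9),
}

micro = micro_f(example_counts)
macro = macro_f(example_counts)
print(f"micro F0.5 = {micro:.4f}")
print(f"macro F0.5 = {macro:.4f}")
```

With these counts the frequent type dominates the pooled (micro) score, while the macro score gives the rare type the same weight, so it comes out lower.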

This pull request introduces a new argument, `f_average`, to `compare_m2.py`. It accepts two values: `micro` (the default) and `macro`.

Sep 20 '24 12:09 zbeloki