
Feature request for metric result keys in `MetricInfo`

Open tybrs opened this issue 2 years ago • 0 comments

Problem

If a user would like to automate the retrieval of metric results, she needs to know each metric's result key, or at least the set of possible keys. For example, suppose I have an `EvaluationSuite` that computes "accuracy" and then "precision". If a downstream procedure needs to iterate over the metrics and retrieve their results, there does not seem to be a programmatic way to guarantee that it can find each metric's results without hardcoding the keys. Often the result key is the same as the metric name, but some metrics return multiple result keys (e.g. glue returns "f1" along with "accuracy" on text-classification tasks).
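To make the hardcoding concrete, here is a minimal sketch of the kind of downstream step described above. It uses plain dictionaries as stand-ins for metric results; the scores, the `RESULT_KEYS` table, and the `flatten_results` helper are all made up for illustration and are not part of `evaluate`.

```python
# Illustrative stand-ins for the result dicts returned by two metrics.
# Names and scores are made up for this example.
results = {
    "accuracy": {"accuracy": 0.91},
    "glue": {"accuracy": 0.85, "f1": 0.88},
}

# Hand-maintained lookup table (hypothetical): the downstream step has to
# carry this itself because the metric metadata does not declare result keys.
RESULT_KEYS = {
    "accuracy": ["accuracy"],
    "glue": ["accuracy", "f1"],
}


def flatten_results(results, result_keys):
    """Collect every score into a flat {"metric/key": value} mapping."""
    flat = {}
    for metric_name, scores in results.items():
        # Raises KeyError if the hand-maintained table is stale or incomplete.
        for key in result_keys[metric_name]:
            flat[f"{metric_name}/{key}"] = scores[key]
    return flat


flat = flatten_results(results, RESULT_KEYS)
```

The table has to be updated by hand whenever a metric is added or its result keys change, which is the maintenance burden this request is about.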

Possible Implementation

Add a `result_features` attribute to `MetricInfo`. A procedure can then expect the metric to return a given key whenever that key's entry in `result_features` is not None.

evaluate.MetricInfo(
    description=_DESCRIPTION,
    citation=_CITATION,
    inputs_description=_KWARGS_DESCRIPTION,
    features=datasets.Features(
        {
            "predictions": datasets.Value("int64" if self.config_name != "stsb" else "float32"),
            "references": datasets.Value("int64" if self.config_name != "stsb" else "float32"),
        }
    ),
    # Proposed new attribute: one entry per possible result key,
    # set to None when the key is not returned for this config.
    result_features=datasets.Features(
        {
            "accuracy": datasets.Value("float32") if self.config_name in ["sst2", "mnli", "mnli_mismatched",
                "mnli_matched", "qnli", "rte", "wnli", "hans"] else None,
            "f1": datasets.Value("float32") if self.config_name in ["mrpc", "qqp"] else None,
            "pearson": datasets.Value("float32") if self.config_name == "cola" else None,
            "spearmanr": datasets.Value("float32") if self.config_name == "stsb" else None,
        }
    ),
    codebase_urls=[],
    reference_urls=[],
    format="numpy",
)
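As a rough sketch of how a downstream procedure might consume the proposed attribute: the `SketchMetricInfo` class below is a hypothetical stand-in for `MetricInfo`, and it uses plain dtype strings in place of `datasets.Value` so the example runs without any dependencies. `result_features` does not exist in `evaluate` today, and the helper names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SketchMetricInfo:
    """Hypothetical stand-in for MetricInfo carrying the proposed attribute."""

    name: str
    # Maps result key -> dtype string, or None when the key is not returned
    # for this configuration (mirroring the proposal above).
    result_features: dict = field(default_factory=dict)


def expected_result_keys(info):
    """Return the result keys a metric declares it will produce."""
    return [key for key, feature in info.result_features.items() if feature is not None]


def retrieve(info, scores):
    """Pick the declared keys out of a result dict, failing loudly on a mismatch."""
    keys = expected_result_keys(info)
    missing = [key for key in keys if key not in scores]
    if missing:
        raise KeyError(f"{info.name} did not return declared keys: {missing}")
    return {key: scores[key] for key in keys}


# Example: a config that declares only "accuracy" as a non-None result key.
info = SketchMetricInfo(
    name="glue/sst2",
    result_features={"accuracy": "float32", "f1": None, "pearson": None, "spearmanr": None},
)
picked = retrieve(info, {"accuracy": 0.93})
```

With something like this, the procedure reads the expected keys from the metric's own metadata instead of a hand-maintained table, and a mismatch between declared and returned keys surfaces as an explicit error.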

tybrs, Sep 07 '23 19:09