Cannot use f1/recall/precision arguments in `CombinedEvaluations.compute`
This works:

```python
import evaluate

metric = evaluate.load("f1")
metric.compute(references=[0, 1, 0, 1, 0], predictions=[0, 0, 1, 1, 0], average=None)
```
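For reference, with `average=None` the f1 metric returns one score per class; hand-computing the confusion counts for the example above gives `{'f1': array([0.66666667, 0.5])}`.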
This won't work:

```python
import evaluate

metric = evaluate.combine(["f1"])
# `average` is silently dropped here, so the per-class scores are never returned
metric.compute(references=[0, 1, 0, 1, 0], predictions=[0, 0, 1, 1, 0], average=None)
```
Reason: `average` is not included in the f1 metric's features (https://github.com/huggingface/evaluate/blob/eaf34a7d04e7ab3e6155a046f6d7fda01d9ead84/metrics/f1/f1.py#L112), and `CombinedEvaluations.compute` silently ignores any argument that is not included in the features (https://github.com/huggingface/evaluate/blob/eaf34a7d04e7ab3e6155a046f6d7fda01d9ead84/src/evaluate/module.py#L858).
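A minimal sketch of the filtering behaviour at the linked line, assuming it boils down to keeping only kwargs whose names appear among the metric's declared feature names (the real code does more bookkeeping):

```python
# Hedged sketch: CombinedEvaluations.compute forwards only the kwargs that
# match a metric's declared features, so `average` never reaches f1.
compute_kwargs = {
    "references": [0, 1, 0, 1, 0],
    "predictions": [0, 0, 1, 1, 0],
    "average": None,
}
feature_names = ["predictions", "references"]  # f1's declared features
forwarded = {k: v for k, v in compute_kwargs.items() if k in feature_names}
print(forwarded)  # 'average' is gone: {'references': [...], 'predictions': [...]}
```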
Is this expected or a bug?
@lvwerra do you have any opinion on that?
Yes, this is a current limitation of `combine`: you can't pass any settings to `compute`, only the features. Rather than fixing this in `combine`, we aim to enable changing the settings when the metric is loaded (#169). This should be coming in the next few weeks.
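If that lands, usage might look roughly like the following. This is a hypothetical sketch of load-time settings; the exact keyword and API in #169 are assumptions here, not a released interface:

```python
import evaluate

# Hypothetical: pass compute settings once at load time instead of per call.
f1 = evaluate.load("f1", average=None)  # assumed keyword; see #169
metric = evaluate.combine([f1])
metric.compute(references=[0, 1, 0, 1, 0], predictions=[0, 0, 1, 1, 0])
```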
@lvwerra Do you mind if I ask whether you have an estimate on when this issue will be closed? I reviewed #188 and noticed that it seems to be passing for the pip release. Is there a chance we can get it in the next couple of days?
@lvwerra I also need to pass custom kwargs to my metrics. My current workaround is to override `CombinedEvaluations.compute` in a child class and remove the filtering at https://github.com/huggingface/evaluate/blob/eaf34a7d04e7ab3e6155a046f6d7fda01d9ead84/src/evaluate/module.py#L858. This works in my specific use case because all my (custom) metrics accept `**kwargs` in their `compute` method, which means they will just ignore extra args. A sketch of the subclass is below.
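A minimal sketch of that workaround, assuming `CombinedEvaluations` stores its sub-metrics on `self.evaluation_modules` and that every sub-metric's `compute` tolerates extra kwargs (both assumptions; the actual override would copy the original method body and delete only the filtering line):

```python
import evaluate
from evaluate.module import CombinedEvaluations


class PermissiveCombinedEvaluations(CombinedEvaluations):
    # Hedged sketch: forward every kwarg to each sub-metric instead of
    # filtering them against the metric's declared features.
    def compute(self, predictions=None, references=None, **kwargs):
        results = {}
        for module in self.evaluation_modules:  # assumed attribute name
            results.update(
                module.compute(predictions=predictions, references=references, **kwargs)
            )
        return results


metric = PermissiveCombinedEvaluations([evaluate.load("f1")])
metric.compute(references=[0, 1, 0, 1, 0], predictions=[0, 0, 1, 1, 0], average=None)
```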
Has this been resolved?
Not yet. There is an issue with the sync mechanism between the Hub and the library, which is why we had to roll back #169. Merging it will break pre-0.3.0 installs, so we need to wait for sufficient adoption.