evaluate Add support for two input columns for TextClassificationEvaluator

Add support for two input columns for TextClassificationEvaluator

Open fxmarty opened this issue 3 years ago • 4 comments

trafficstars

This PR:

Refactors the docstrings to avoid duplicates in the Evaluator subclasses
Put all arguments in the compute() signature for subclasses of Evaluator, so as to be able to call the method without keyword arguments (had the issue with issue classification since we redefined the default input_column="image")
Adds support for two inputs columns (as in glue/mnli) for TextClassificationEvaluator

@lvwerra I added a DatasetColumnTextClassification because of the input expected by the transformers TextClassificationPipeline: https://github.com/huggingface/transformers/blob/d0acc9537829e7d067edbb791473bbceb2ecf056/src/transformers/pipelines/text_classification.py#L109-L111 . I did not manage to use DatasetColumn for it, could we do better than what I did? It's a bit hacky here. Probably memory is not so much sensitive for text compared to images, so we could not use these wrappers altogether?

This stills misses a bit of documentation.

Partially closes https://github.com/huggingface/evaluate/issues/196

Jul 26 '22 15:07 fxmarty

The documentation is not available anymore as the PR was closed or merged.

Jul 27 '22 08:07 HuggingFaceDocBuilderDev

Looks great! Thanks for adding this, I left a a few minor comments. I agree with @ola13 that splitting such a PR in two would make it much easier to review (for next time) :)

Jul 28 '22 10:07 lvwerra

Hello, thank you for your feedback! I will be careful next to break PRs in unitary changes.

Aug 01 '22 07:08 fxmarty

@lvwerra @ola13 Feel free to re-review! Sorry for the lot of code change for a minimal new feature, will break it up in the future.

Aug 01 '22 08:08 fxmarty

evaluate evaluate copied to clipboard

Add support for two input columns for TextClassificationEvaluator

evaluate
evaluate copied to clipboard